Prologue



Jaap Ponstein has been a professor of Operations Research at the University 
of Groningen for more than 20 years. The beginning of his career in Groningen 
coincided with the beginning of the study of econometrics in the university, in 
which OR was positioned. His broad experience as a mathematician, both pure 
and applied, appeared to be instrumental in the development of OR in Groningen. 
He became an authority on optimization, duality, convexity and generalized dif- 
ferentiability. His book "Approaches to the theory of optimization" (Cambridge 
University Press, 1980) is a beautiful example of the precise and transparent way 
in which he succeeded in connecting various areas of optimization. He was the 
right person to act as promotor honoris causa on the occasion of the honorary 
degree of R.T. Rockafellar (University of Washington, Seattle, USA) on June 20, 
1984. 

Professor Ponstein was known as an excellent teacher. For students and co- 
workers he was the ideal advisor: inspiring, and always making time to discuss 
papers. Usually, his comments were critical, both on contents and on formula- 
tions, but always aimed at improvement and maximal clarity. Hans Nieuwenhuis, 
Caspar Schweigman, Gerard Sierksma and I, to name a few, learned a great 
deal from him. We still are impressed by his integrity, his commitment, and his 
scientific erudition. 

In the eighties Imme van den Berg raised Jaap's interest in so-called Nonstan- 
dard Analysis. The possibility of dealing with infinitely small and infinitely large 
quantities as numbers, without losing mathematical rigor appealed to him. After 
his retirement (ultimo 1989) he made a profound study of this subject. The point 
of view of his research characterizes Jaap's scientific attitude: How to introduce 
infinitesimals and other "nonstandard" numbers naively and simplemindedly, but 
in a way such that the resulting theory is mathematically sound, and complete 
within obvious limits. 

This book is the result of his study. It was finished in the summer of 1995. 
Unfortunately, he was not able to publish it: he died on November 22, 1995.

I am pleased and proud that the Faculty of Economics at the University of
Groningen has offered the opportunity to publish the book. It underlines the 
respect we feel for Jaap's merits. 



Groningen, February 2001 



Wim Klein Haneveld



After Jaap had presented his inaugural speech my mother said to me: "I had no
idea what he was talking about, but he presented it in such a way that I couldn't
help thinking: Any minute now it will all make sense". This was one of the special
things about Jaap: his ability to inspire people with topics that appealed to him. It 
was the same when he explained to our small daughters, Marianne, Anne, Els and 
Ada, what a half and a quarter is. He cut an apple-pie first into two and then into 
four pieces. That same spirit of enthusiasm showed itself later on in Groningen 
when teaching econometrics students and when helping our neighbours' children 
with their "difficult maths". 

To my mind the above illustrates the way in which Jaap began working on this 
book. Infinitely small numbers and their applications fascinated him. Jaap under- 
stood how to address the subject in an unconventional way, in a way that would 
make the subject more accessible to others. He had extensive discussions with Els
about which word, and why, would be the best to describe the infinitely small numbers.
As the manuscript neared completion, he discussed with Anne the possibility of 
publishing it. 

That was the situation when Jaap became sick in September 1995. My daughters 
and I did not act on the manuscript for quite some time. We were unsure about 
what to do. Finally, I contacted Imme van den Berg who agreed to review the 
manuscript and bring it to the attention of the Faculty of Economics at the Uni- 
versity of Groningen. The Faculty, represented by Wim Klein Haneveld, offered 
to publish the manuscript as a volume of the internal Research Reports series of 
the SOM Research School. 

I would like to express my gratitude to Imme van den Berg for his rapid and 
thorough review of the manuscript, and to Wim Klein Haneveld for his careful
handling of the publication process. Last but not least, I would like to thank Suwarni
Bambang Oetomo, who invested an enormous amount of effort into the task of
transferring the manuscript from the word-processing program CW into the more
modern TeX. It would not have been possible to publish this document without
the financial assistance offered by the Faculty of Economics. My sincere thanks,
also on behalf of our daughters, go to all who have helped to bring about the
publication of Jaap's manuscript.



Zeist, November 2000 



Sileen Ponstein-Troelstra



Preface



An infinitesimal is a 'number' that is smaller than each positive real number and
is larger than each negative real number, so that in the real number system there 
is just one infinitesimal, i.e. zero. But most of the time only nonzero infinitesimals 
are of interest. This is related to the fact that when in the usual limit definition 
x is tending to c, most of the time only the values of x that are different from c 
are of interest. Hence the real number system has to be extended in some way or 
other in order to include all infinitesimals. 

This book is concerned with an attempt to introduce the infinitesimals and the 
other 'nonstandard' numbers in a naive, simpleminded way. Nevertheless, the 
resulting theory is hoped to be mathematically sound, and to be complete within 
obvious limits. Very likely, however, even if 'nonstandard analysis' is presented 
naively, we cannot do without the axiom of choice (there is a restricted version 
of nonstandard analysis, less elegant and less powerful, that does not need it). 
This is a pity, because this axiom is not obvious to every mathematician, and is 
even rejected by constructivistic mathematicians, which is not unreasonable as it 
does not tell us how the relevant choice could be made (except in simple cases, 
but then the axiom is not needed). 

The remaining basic assumptions that will be made would seem to be acceptable 
to many mathematicians, although they will be taken partly from formalistic 
mathematics - i.e. the usual logical principles, in particular the principle of the 
excluded third - as well as from constructivistic mathematics - i.e. that at the 
start of all of mathematics the natural numbers (in the classical sense of the term) 
are given to us. Not only the natural number, but also the set and the pair will be 
taken as primitive notions. The net effect of this is a version of mathematics that, 
except for truly nonstandard results, would seem to produce the same theorems 
as produced by classical mathematics. 

One of the consequences of combining ideas from the two main schools of math- 
ematical thinking is that the usual axioms of set theory, notably those due to 
Zermelo and Fraenkel, will be ignored. First of all, there will be elements that are
not sets, the natural numbers to begin with; only then will sets be formed from
them in stages (or day by day), whereas when starting from the Zermelo-Fraenkel 
axioms each mathematical entity, in particular each natural number, is some set. 
From a formal point of view the latter has the advantage that there is just one 
primitive notion, but from a naive point of view it is not so obvious why numbers 
should be sets (in formalistic mathematics after the natural numbers come to life 
in the form of sets, this fact is concealed as soon as possible). Moreover, aren't 
we presupposing at least the order of the natural numbers already when writing 
down axioms by means of suitable symbols? 

To a certain extent nonstandard analysis is superfluous! For if a theorem of classi- 
cal mathematics has a nonstandard proof, it also has a classical proof (this follows 
from what in nonstandard analysis is known as the 'transfer' theorem). Often the 
nonstandard proof is intuitively more attractive, simpler and shorter, which is 
one of the reasons to be interested in nonstandard analysis at all. Another reason 
is that totally new mathematical models for all kinds of problems can be (and in 
the meantime have been) formulated when infinitesimals or other nonstandard
numbers occur in such models. A trivial example is a problem involving a heap 
of sand containing very many grains of sand, but where the number of grains of 
sand must not be infinite. Then taking the inverse of some positive infinitesimal 
and rounding the result up or down produces a so-called infinitely large 'natural 
number' that is larger than each ordinary natural number, but is smaller than 
infinity. It can be manipulated in much the same way as the ordinary numbers, 
which cannot, of course, be said of infinity. As a consequence the mathematics 
of infinitely large sets is essentially simpler than that of infinite sets. A pecu- 
liarity, however, is that the 'selected' infinitesimal and hence the infinitely large 
natural number are not specified the way the number of elements of a set of, say, 
25 elements is specified. On the other hand, if $\omega$ is that infinitely large natural
number, it makes sense to consider another heap of sand with $\omega^2$ grains of sand,
that can be thought of as the result of combining $\omega$ heaps of sand each containing
$\omega$ grains of sand. But in what follows the analysis of practical models containing
nonstandard numbers will not be stressed. 



Chapter 1 
Generalities 



1.1 Infinitesimals and other nonstandard numbers: 
getting acquainted 

An infinitesimal is a number that is smaller than every positive real number and 
is larger than every negative real number, or, equivalently, in absolute value it 
is smaller than $1/m$ for all $m \in \mathbb{N} = \{1, 2, 3, \ldots\}$. Zero is the only real number
that at the same time is an infinitesimal, so that the nonzero infinitesimals do
not occur in classical mathematics. Yet, they can be treated in much the same
way as can the classical numbers. For example, each nonzero infinitesimal $\varepsilon$ can
be inverted and the result is the number $\omega = 1/\varepsilon$. It follows that $|\omega| > m$ for
all $m \in \mathbb{N}$, for which reason $\omega$ is called (positive or negative) hyperlarge (or
infinitely large). Hyperlarge numbers too do not occur in classical mathematics,
but nevertheless can be treated like classical numbers. If, for example, $\omega$ is positive
hyperlarge, we can compute $\sqrt{\omega}$, $\omega/2$, $\omega - 1$, $\omega + 1$, $2\omega$, $\omega^2$, etc., and we have
$(\omega - 1) + (\omega + 1) = 2\omega$, $(\omega - 1) \cdot (\omega + 1) = \omega^2 - 1$, etc. Also, for all $m \in \mathbb{N}$,

$$m < \sqrt{\omega} < \omega/2 < \omega - 1 < \omega < \omega + 1 < 2\omega < \omega^2,$$

giving seven different hyperlarge numbers. The positive hyperlarge numbers must
not be confused with infinity ($\infty$), which should not be regarded as a number at all,
and which anyway does not satisfy these inequalities, except the first one.
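
The chain of inequalities can be made plausible with ordinary numbers. Section 1.2 mentions that the sequence (1, 2, 3, . . .) generates a hyperlarge number; the Python sketch below checks, for a fixed ordinary m, from which term of that sequence onward the chain holds termwise. The function names and the cut-off `limit` are hypothetical choices made for this illustration, and "holding from some term onward" is used here only as an informal stand-in for the precise criterion developed later in the book.

```python
import math

def chain_holds(w, m):
    """Check m < sqrt(w) < w/2 < w-1 < w < w+1 < 2w < w**2 for an ordinary number w."""
    values = [m, math.sqrt(w), w / 2, w - 1, w, w + 1, 2 * w, w ** 2]
    return all(p < q for p, q in zip(values, values[1:]))

def first_index_from_which_chain_holds(m, limit=10_000):
    """Smallest n <= limit such that the chain holds for every term n, ..., limit
    of the sequence (1, 2, 3, ...). The cut-off `limit` is an arbitrary assumption."""
    n = limit + 1
    for w in range(limit, 0, -1):
        if not chain_holds(w, m):
            break
        n = w
    return n

# For each fixed m only finitely many initial terms violate the chain.
print(first_index_from_which_chain_holds(3))
print(first_index_from_which_chain_holds(10))
```

However large m is chosen, only an initial stretch of terms fails, which is the finite shadow of the statement that the inequalities hold for all ordinary m at once.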

Regrettably, there does not seem to exist a synonym for 'hyperlarge number' 
that would make a nice pair with 'infinitesimal', so let us introduce the synonym 
'hypersmall number' for the latter.

If $\varepsilon$ is hypersmall, if $\delta$ too is hypersmall but nonzero, and if $\omega$ is positive
hyperlarge, so that $-\omega$ is negative hyperlarge, we write,

$$\varepsilon \sim 0, \quad \delta \sim 0, \quad \omega \sim \infty, \quad -\omega \sim -\infty, \quad \text{respectively}.$$

It would be wrong, of course, to deduce from $\omega \sim \infty$ that the difference between
$\omega$ and $\infty$, or that between $-\omega$ and $-\infty$, would be hypersmall.

Given any $x \in \mathbb{R}$, $x \neq 0$, and any $\delta \sim 0$, let $t = x + \delta$, then,

$$\varepsilon < |t| < \omega,$$

for all $\varepsilon \sim 0$ and all $\omega \sim \infty$. The number $t$ is called appreciable (as it is not too
small and not too large).

Three nonoverlapping sets of numbers (old or new) can now be presented: 

a) the set of all infinitesimals, to which zero belongs, 

b) the set of all appreciable numbers, to which all nonzero reals belong, and 

c) the set of all hyperlarge numbers, containing no classical numbers at all. 

Together these three sets constitute the set of all numbers of 'real nonstandard 
analysis'. This set, which clearly is an extension of R is indicated by, 

*R 

and is called the *-transform of R. The elements of *R are called hyperreal. The 
use of the prefix 'hyper' here is not entirely defendable, as, say, 5, which obviously 
is an element of *R, is just an ordinary real. 

Abbreviating hypersmall, appreciable, and hyperlarge to s, a and l, respectively,
and assuming that x and y are positive numbers, for addition and multiplication
the following holds,

addition (x + y)          multiplication (x · y)

 y \ x | s  a  l           y \ x | s  a  l
   s   | s  a  l             s   | s  s  ?
   a   | a  a  l             a   | s  a  l
   l   | l  l  l             l   | ?  l  l

where the question marks stand for s or a or l. Examples for the lower left
question mark are $x \sim 0$ and $y = (\sqrt{x})^{-1}$, or $1/x$, or $1/x^2$. For $x - y$ the results
are the same as for $x + y$ (if still $x, y > 0$), except that if both x and y are
appreciable, then $x - y$ is either hypersmall or appreciable, and that if both x
and y are hyperlarge, then $x - y$ is either hyperlarge (positive or negative), or
appreciable, or hypersmall, as is shown by the following examples: $y = x/2$, or
$2x$, or $x - 1$, or $x + \varepsilon$, with $\varepsilon \sim 0$.

If a number is not hyperlarge it is called finite or limited.

Remark: Elsewhere in the literature, any element of *R is called finite.

Clearly, $t$ is finite if and only if $t = x + \varepsilon$ for some $x \in \mathbb{R}$ and some $\varepsilon \sim 0$. Given
such a $t$, both $x$ and $\varepsilon$ are unique, for,

$$x + \varepsilon = y + \delta, \quad x, y \in \mathbb{R}, \quad \varepsilon, \delta \sim 0$$

implies that $x - y = \delta - \varepsilon \sim 0$, so that (as $x - y \in \mathbb{R}$), $x - y = 0$, hence $x = y$
and $\varepsilon = \delta$. By definition $x$ is called the standard part of $t$, and this is written as,

$$x = \mathrm{st}(t).$$

The standard part function st provides an important (mainly one-way) bridge 
between the finite numbers of nonstandard analysis and the classical numbers. 
Trivially, if t is itself a classical number, then st(t) = t. 



1.2 Other *-transforms; generating new numbers 

The *-transform not only can be obtained for R but also for N, Z, Q, and in fact
any set X of classical mathematics (and for much more, see Section 1.5). Their
*-transforms are indicated by *N, *Z, *Q, and *X, respectively. Throwing all
non-finite numbers out of *N and *Z we obtain again N and Z, but something similar
is not true for *Q (for *R we know this already), simply because *Q (just as *R)
contains finite non-classical numbers. Yet there is a striking difference between
*Q and *R in this respect: the 'standard part theorem' discussed at the end of
the preceding section does not hold for *Q, that is to say, there are finite elements
$t$ of *Q that cannot be written as $t = x + \varepsilon$, with $x \in \mathbb{Q}$, $\varepsilon \in {}^*\mathbb{Q}$, $\varepsilon \sim 0$. For
let $c$ be any irrational number, say $c = \sqrt{2}$, and let $(r_1, r_2, \ldots)$ be some Cauchy
sequence of rationals converging to $c$. Later on it will become clear that then
the sequence $(r_1 - c, r_2 - c, \ldots)$ 'generates' an infinitesimal $\delta$ in *R (because this
sequence converges to zero). On the other hand $(r_1, r_2, \ldots)$ generates an element
$r \in {}^*\mathbb{Q} \subset {}^*\mathbb{R}$, and $r$ is finite (because the $r_i$ are rational, and this sequence
converges), but it has no standard part in Q, for otherwise $r = x + \varepsilon$ for some
$x \in \mathbb{Q}$ and some $\varepsilon \in {}^*\mathbb{Q}$, $\varepsilon \sim 0$. But $(r_1 - c, r_2 - c, \ldots)$ also generates the finite
number $r - c \in {}^*\mathbb{R}$, so that $r - c = \delta \sim 0$. It follows that $x - c = \delta - \varepsilon \sim 0$,
hence $x - c = 0$ (as $x - c$ is an ordinary real), which would mean that $c \in \mathbb{Q}$, a
contradiction. On the other hand, in *R we have that $\mathrm{st}(r) = c$. (Carrying this
argument further it turns out that there exists a 1-1 mapping between R and
the set of all finite elements of *Q modulo the set of all rational infinitesimals,
preserving addition and multiplication; i.e. the mapping is an isomorphism. In
other words, R (not *R) can in a sense be produced by *Q.)

There are various ways to introduce the new numbers. Below this will be done by
means of infinite sequences of classical numbers. In particular, the elements of *R 
will be generated by means of infinite sequences of reals, and it will be necessary 
to consider all such sequences. (Recall that the elements of R can be generated by 
means of rather special infinite sequences of rationals, i.e. the Cauchy sequences.) 
More generally, given any classical set X the elements of its *-transform *X will
be generated by means of infinite sequences of elements of X, and again all such
sequences must be taken into account. Each such sequence 'generates' an element
of *X, and in case X is a set of numbers (or n-tuples of numbers) special
sequences generate the elements of X itself. For example, (1, 2, 3, . . .) generates a
hyperlarge element of *N, and (3/2, 5/4, 9/8, . . .) generates a finite element of *Q,
that is equal to the sum of 1, generated by (1, 1, 1, . . .), and an infinitesimal,
generated by (1/2, 1/4, 1/8, . . .). Different sequences may generate the same element
of *X. In fact, given any $x \in X$ there are many (uncountably many) different
sequences that generate x (if X contains at least two elements). For example,
changing finitely many terms of a generating sequence has no effect on the ele- 
ment generated. But there are many more variations on this theme. Wouldn't it 
be possible to restrict ourselves to a suitable subset of all sequences? Unless we 
are satisfied with some sort of mutilated nonstandard analysis, most likely the 
answer is 'no'. See Section 4.4. 
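
As a rough illustration of the generating sequences described above, the following Python sketch works with finite prefixes of sequences only. The helper names `prefix` and `agree_eventually`, the prefix length `N`, and the parameter `changed` are hypothetical, and agreement on the tail of a finite prefix is merely a stand-in for the precise notion of "generating the same element" that is developed later.

```python
from fractions import Fraction

def prefix(term, n):
    """First n terms of the sequence i -> term(i), for i = 1, ..., n."""
    return [term(i) for i in range(1, n + 1)]

N = 20  # length of the finite prefix inspected (an arbitrary choice)

one = prefix(lambda i: Fraction(1), N)              # (1, 1, 1, ...)
eps = prefix(lambda i: Fraction(1, 2**i), N)        # (1/2, 1/4, 1/8, ...)
t = prefix(lambda i: Fraction(2**i + 1, 2**i), N)   # (3/2, 5/4, 9/8, ...)

# Termwise, t is the sum of the sequences generating 1 and the infinitesimal.
assert t == [a + b for a, b in zip(one, eps)]

def agree_eventually(u, v, changed=5):
    """True if the prefixes u and v agree beyond their first `changed` terms."""
    return u[changed:] == v[changed:]

# Changing finitely many terms (here the first three) leaves the tail untouched.
t_changed = [Fraction(0)] * 3 + t[3:]
print(agree_eventually(t, t_changed))
print(agree_eventually(t, one))
```

The first comparison succeeds because only an initial stretch was altered; the second fails because the two sequences differ in every term.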

Anyway, the nuisance of having to use generating sequences is only temporary. 
Once the new numbers have been introduced (as well as new functions, etc.) in 
most cases it is not necessary at all to know that they came about by means of 
infinite sequences. The situation is entirely analogous to that of introducing the 
real numbers: most of real analysis can be developed without the interference of 
Cauchy sequences. Most of the time an irrational such as $\sqrt{2}$ is treated as just a
number, not as a sequence. 

Although *N, *Z, *Q and *R are extensions of N, Z, Q and R, respectively, in
general *X is not always an extension of X. If, for example, $X = \{\mathbb{N}\}$, then
${}^*X = \{{}^*\mathbb{N}\}$, and since $\mathbb{N} \neq {}^*\mathbb{N}$, X is not contained in *X.



1.3 Bound and free variables; prenex normal form 

At this point we must interrupt the main subject to say a few words about the 
formulation of mathematical statements and the difference between bound and 
free variables occurring in them. It will be convenient to use the logical connectives 
$\neg$, $\wedge$, $\vee$, $\Rightarrow$, $\Leftrightarrow$ (for NOT, AND, OR, IMPLIES, IS EQUIVALENT TO, respectively),
the universal quantifier $\forall$ and the existential quantifier $\exists$ ('$\forall x : \ldots$' means: 'for all
x such that . . .', and '$\exists x : \ldots$' means 'there exists an x such that . . .'). Apart
from operations, logical connectives and quantifiers, each mathematical statement 
contains a number of variables and constants, which may be (expressions of) 
numbers, or functions, or n-tuples of numbers or functions, or sets of numbers or 
functions, etc. 

For example, in, 

$$\exists x : [x \in \mathbb{R} \wedge x^2 + y = 0],$$

the constants are R, 2, and 0, and the variables are x and y (unless y is fixed; then
it may be regarded as an unspecified constant). Clearly, this statement changes
into another meaningful statement if y is replaced by another variable or by
some specified constant, and the same is true if one or more of the constants are
replaced by other constants or by variables, at least within reasonable limits:

$$\exists x : [x \in \mathbb{R} \wedge x^2 + z = 0],$$

or,

$$\exists x : [x \in \mathbb{Q} \wedge x^2 + y = 0].$$

Obviously, the new statement may not be equivalent to the given one, but that
is not important in the present discussion (in fact the last statement is different
from the given one, and the last but one is in case z is a variable that is different
from the variable y). On the other hand, (only) replacing x by a constant leads
to nonsense. Moreover, replacing the symbol 'x' by, say, 'z' leads to an equivalent
statement:

$$\exists z : [z \in \mathbb{R} \wedge z^2 + y = 0],$$

for it does not matter what symbol is used to indicate the relevant variable (again 
within reasonable limits). For these reasons, with respect to the given statement 
x is called a bound variable, and y is called a free variable. More generally, given 
any statement, if replacing any variable occurring in it by some constant leads to 
another meaningful statement that variable is called a free variable with respect 
to it, and any other variable occurring in it is called a bound variable (or a 
dummy variable) with respect to it. Any constant might be called free as well. It 
follows that the truth value of a statement depends on its free variables and its 
constants, but not on its bound variables. 
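
The distinction can be mimicked in a programming language, where a quantified variable is local to the quantifier and a free variable is a parameter. The Python sketch below is only an analogy: the function names are hypothetical, and the finite range used as `domain` is an assumed stand-in for R, chosen so that the quantifier can actually be evaluated.

```python
def exists_root(y, domain):
    """Truth value of 'there exists x in domain with x**2 + y == 0'.

    y is free (a parameter); x is bound by the quantifier `any`.
    """
    return any(x * x + y == 0 for x in domain)

def exists_root_renamed(y, domain):
    """The same statement with the bound variable renamed from x to z."""
    return any(z * z + y == 0 for z in domain)

domain = range(-10, 11)  # finite stand-in for R (an assumption of this sketch)

# The truth value depends on the free variable y ...
print(exists_root(-4, domain))  # True: x = 2 (or x = -2) works
print(exists_root(3, domain))   # False: no x in the domain has x*x == -3

# ... but not on the name of the bound variable.
assert all(exists_root(y, domain) == exists_root_renamed(y, domain)
           for y in range(-20, 21))
```

Substituting a constant for `y` gives another meaningful call, whereas there is nothing to substitute for `x` from the outside, which matches the definition of free and bound variables given above.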

Furthermore, it follows that the x in '$\forall x$' or in '$\exists x$' is a bound variable, but the
converse does not seem to be true. For example, in a limit definition the clause
'for x tending to c' may occur. Then x is a bound variable. And x may also
be a bound variable in the definition of an integral, namely if '$dx$' occurs in the
well-known way. If limits and integrals are replaced by their definitions, however,
x will be 'bound' to a quantifier. How about x in,

$$x \in \mathbb{R} \Rightarrow x^2 \geq 0 \; ?$$

Many people will interpret this as $\forall x : [x \in \mathbb{R} \Rightarrow x^2 \geq 0]$, and then x is a
bound variable, yet x is a free variable simply because, say, $5 \in \mathbb{R} \Rightarrow 5^2 \geq 0$, is a
meaningful (though uninteresting) statement. Note that many theorems take the
form: 'if x satisfies . . ., then . . .', which usually is meant to be read as: 'for all x
such that . . ., it is true that . . .'.

Some confusion may arise because of the possibility to indicate different bound 
variables by the same symbol (with free variables this is not possible), as in, 

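$$[\forall x : [x \in \mathbb{R} \Rightarrow x^2 \geq 0]] \wedge [\exists x : [x \in \mathbb{R} \wedge x^2 + 1 = 0]]$$

which is equivalent to,

$$[\forall x : [x \in \mathbb{R} \Rightarrow x^2 \geq 0]] \wedge [\exists y : [y \in \mathbb{R} \wedge y^2 + 1 = 0]].$$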

Let us agree to avoid this, and in this example reject the first formulation. 

Fortunately, any statement can be rewritten in its so-called prenex normal form. 
This is a unique rearrangement of the statement, whereby the quantifiers precede 
the logical connectives. Details regarding the existence and the uniqueness of the 
prenex normal form can be found in books on formal logic. For example, the 
following statement is in prenex normal form, 

$$\forall x : \exists y : \forall z : P(c, d, \ldots, s, t, \ldots, x, y, z),$$

where $P(c, d, \ldots, s, t, \ldots, x, y, z)$ is a statement that contains no quantifiers at all,
and apart from its free (!) variables x, y, and z, contains the constants c, d, . . .
and other free variables s, t, . . . , as well as logical connectives. It is clear that
with three quantifiers there are eight possible normal forms if everything except
the quantifiers is ignored: $\forall\forall\forall$, $\forall\forall\exists$, $\forall\exists\forall$, $\exists\forall\forall$, $\forall\exists\exists$, $\exists\forall\exists$, $\exists\exists\forall$, $\exists\exists\exists$.

It is better, however, to consider the following variations, where for clarity in P 
there is only one constant c and only one additional free variable s, 

$$\forall x \in X : \exists y \in Y : \forall z \in Z : P(c, s, x, y, z),$$

where X, Y and Z are properly selected sets, or,

$$\forall x \in X, A(x) : \exists y \in Y, B(x, y) : \forall z \in Z, C(x, y, z) : P(c, s, x, y, z),$$

where A(x), B(x,y), and C(x,y,z) are relatively simple conditions regarding
the bound variables involved (such as x > 0), conditions that do not contain 
quantifiers. This is to be read as: 

for all x in X satisfying A(x) it is true that there exists a y in Y 
satisfying B(x,y) such that for all z in Z satisfying C(x,y,z) it is 
true that P(c, s, x, y, z). 
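
For finite sets this reading can be evaluated mechanically. The Python sketch below nests `all`, `any` and `all` in the order of the three quantifiers; the function name `holds`, the finite ranges, and the particular conditions are hypothetical choices for illustration, and the constant c and the free variable s are simply left out (they could be captured by the lambdas).

```python
def holds(X, Y, Z, A, B, C, P):
    """Evaluate: for all x in X with A(x) there exists y in Y with B(x, y)
    such that for all z in Z with C(x, y, z) it is true that P(x, y, z)."""
    return all(
        any(
            all(P(x, y, z) for z in Z if C(x, y, z))
            for y in Y if B(x, y)
        )
        for x in X if A(x)
    )

X, Y, Z = range(1, 6), range(1, 20), range(1, 20)  # assumed finite sets

# for all x > 0 there is a y > x such that every z >= y satisfies z > x
print(holds(X, Y, Z,
            A=lambda x: x > 0,
            B=lambda x, y: y > x,
            C=lambda x, y, z: z >= y,
            P=lambda x, y, z: z > x))

# same statement, but y is now required to satisfy y < x: fails for x = 1
print(holds(X, Y, Z,
            A=lambda x: x > 0,
            B=lambda x, y: y < x,
            C=lambda x, y, z: z >= y,
            P=lambda x, y, z: z > x))
```

Each quantifier carries its own set and its own side condition, which is the explicit form recommended in the text.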

The reason why set inclusions should be specified explicitly is to avoid certain 
errors in nonstandard analysis, errors that cannot occur in classical mathematics. 
Even if x is a subset of a larger set Y, it is better to replace '$x \subset Y$' by '$x \in \mathcal{P}(Y)$',
where $\mathcal{P}(Y)$ is the power set of Y, i.e. the set of all subsets of Y. This will become
clear when discussing the *-transform in Section 1.5, and the difference between 
internal and external constants in Section 1.6. 



1.4 The purpose of nonstandard analysis 

After the digression of the preceding section let us now contemplate the purpose 
of nonstandard analysis. Starting from IN, the sets Z, Q and R (and C, but below 
complex numbers will be ignored) have been introduced in classical mathematics 
in order to enrich mathematics with more tools and to refine existing tools. The 
introduction of negative numbers, of fractions, and of irrational numbers is felt 
as a strong necessity, and without it mathematics would only be a small portion 
of what it actually is. The introduction of *N, *Z, *Q, and *R, however, was not 
meant at all to enrich mathematics (at least not when it all started), but only 
to simplify doing mathematics. For as soon as notions like limit and continuity 
are involved, definitions in nonstandard analysis can be given a simpler form, 
and theorems can be proved in a simpler way. Often the simplifications are con- 
siderable. In one case the proof of a classical conjecture was found by means of 
nonstandard analysis, after which a classical proof was found as well. Moreover, 
both definitions and proofs receive a more natural appearance. This may even 
enhance the discovery of new facts. 

In the meantime nonstandard analysis has also been applied in a more traditional
way, namely to introduce new mathematical notions and models. Examples
can be found in probability theory, asymptotic analysis, mathematical physics, 
economics, etc. In what follows the attention will primarily be focused, however, 
on simplifying mathematics, rather than on enriching it with new concepts.

As an example of a simpler definition, consider continuity. A function $f$ from R
to R is continuous at $c \in \mathbb{R}$ if statement (1.1) holds,

$$\forall \varepsilon \in \mathbb{R}, \varepsilon > 0 : \exists \delta \in \mathbb{R}, \delta > 0 : \forall x \in \mathbb{R}, |x - c| < \delta : |f(x) - f(c)| < \varepsilon. \qquad (1.1)$$

Now to $f$ there corresponds a unique function ${}^*f$, called the *-transform of $f$,
that is a function from *R to *R, such that ${}^*f(x) = f(x)$ if $x \in \mathbb{R}$, and (1.1) is
true if and only if (1.2), which is the *-transform of (1.1), is true,

$$\forall \varepsilon \in {}^*\mathbb{R}, \varepsilon > 0 : \exists \delta \in {}^*\mathbb{R}, \delta > 0 : \forall x \in {}^*\mathbb{R}, |x - c| < \delta : |{}^*f(x) - {}^*f(c)| < \varepsilon. \qquad (1.2)$$

(More about the *-transform in the next section.)

Moreover, (1.2) is equivalent to the much simpler statement (1.3),

$$\forall \delta \in {}^*\mathbb{R}, \delta \sim 0 : {}^*f(c + \delta) - {}^*f(c) \sim 0. \qquad (1.3)$$

Warning: The equivalence between (1.2) and (1.3) does in general not hold if c is
replaced by a nonstandard number, or if f is replaced by a nonstandard function.

The essence of (1.3) is,

$$\delta \sim 0 \Rightarrow {}^*f(c + \delta) - {}^*f(c) \sim 0, \qquad (1.4)$$

which is precisely what we want the definition of continuity to be: if $x - c = \delta$ is
infinitely close to zero, then $f(x) - f(c)$ too should be infinitely close to zero. The
only problem in classical mathematics is that 'infinitely close to' is not (and most
likely will never be) a well defined notion. In nonstandard analysis, however, all
that need be done is to replace 'infinitely close to zero' by '$\sim 0$'.

Note that in all four definitions $\delta$ plays the same role (i.e. 'distance' from c), but
that in (1.1) and (1.2) it is bound to $\exists$, whereas in (1.3) it is bound to $\forall$. Also note
that (1.1) and (1.2) each contain three quantifiers, but that (1.3) contains only
one (that (1.4) contains no quantifiers at all is because it really is not complete
as '$\forall \delta$' is missing).

An illustration of a simpler proof is that of the intermediate value theorem: if
$f : \mathbb{R} \to \mathbb{R}$ is continuous in the closed interval $[a, b]$, $a < b$, a and b both finite,
and $f(a) < 0$, $f(b) > 0$, then $f(c) = 0$ for some $c \in [a, b]$.

A nonstandard proof of this theorem proceeds as follows. Let $m \in {}^*\mathbb{N}$ be
hyperlarge. Divide $[a, b]$ in m equal subintervals, each of length $\delta = (b - a)/m$.
Then $\delta \sim 0$. Let n be the smallest element of *N such that ${}^*f(a + n\delta) > 0$, then
${}^*f(a + (n - 1)\delta) \leq 0$. Let $c = \mathrm{st}(a + n\delta)$, then, by continuity,

$${}^*f(a + n\delta) - {}^*f(c) = \varepsilon_1 \quad \text{and} \quad {}^*f(c) - {}^*f(a + (n - 1)\delta) = \varepsilon_2,$$

for certain infinitesimals $\varepsilon_1$ and $\varepsilon_2$. Hence $-\varepsilon_1 < f(c) = {}^*f(c) \leq \varepsilon_2$. But $f(c) \in \mathbb{R}$,
so $f(c) = 0$.

How come, dividing [a, b] in m subintervals if m is not finite, and assuming that n, 
which also is not finite, exists? Yes, this is all right, because hyperlarge numbers 
behave like classical numbers. 

The classical proof of the theorem is more involved, because it is based on the 
fact that a nonempty subset of R that is bounded above has a least upper bound 
(or supremum). 
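
The structure of the nonstandard proof has a finite analogue that can be run on a computer. In the Python sketch below a large ordinary m takes the place of the hyperlarge m, so the result is only an approximation of a zero, not a zero itself; the step of taking the standard part has no finite counterpart. The function name `grid_sign_change` and the test function are hypothetical choices for illustration.

```python
def grid_sign_change(f, a, b, m):
    """Return (left, right): adjacent grid points with f(left) <= 0 < f(right).

    Assumes f(a) < 0 < f(b); [a, b] is cut into m equal parts of length delta.
    """
    delta = (b - a) / m
    # smallest n in 1..m with f(a + n*delta) > 0, mirroring the choice of n in the proof
    n = next(k for k in range(1, m + 1) if f(a + k * delta) > 0)
    return a + (n - 1) * delta, a + n * delta

def f(x):
    """A continuous test function with f(1) < 0 < f(2); its zero is sqrt(2)."""
    return x * x - 2

left, right = grid_sign_change(f, 1.0, 2.0, 10**6)
print(left, right)
```

The two returned points differ by delta and enclose the sign change, just as a + (n - 1)δ and a + nδ do in the proof; with m hyperlarge their common standard part is the required c.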

Exercise: Show this fact by means of nonstandard analysis. 



1.5 More about the *-transform; transfer 

So far a number of isolated instances of *-transforms have been presented (the 
*-transform of R and other sets, of functions from R to R and of statement 
(1.1) in the preceding section). Although it is too early to present a complete 
treatment of the *-transform, a number of interesting aspects of this notion may 
be discussed already now. 

To each number, each set, each function, each operation (such as $+$ and $\cup$), each
simple relation (such as $<$ and $\in$), each logical connective ($\neg$, $\vee$, $\wedge$, $\Rightarrow$, $\Leftrightarrow$),
both quantifiers ($\forall$, $\exists$), each definition, and each statement of classical
mathematics, there corresponds a unique *-transform in nonstandard mathematics.
The notation is quite simple: just add an asterisk to the upper left of the symbol 
representing what is to be transformed. Sometimes the *-transform is identical to 
its inverse image, but often this is not so. In the former case the asterisk should, 
of course, be dropped, but even in the latter case this can sometimes be done 
without creating confusion. Below a number of typical examples is presented, but 
full details will only be given later on. 

a) Numbers. If $x \in \mathbb{R}$, then ${}^*x = x$.

b) Sets. If X is a finite set of numbers, then *X = X, and also (happily so)
${}^*\emptyset = \emptyset$, but if X is an infinite set of numbers, then X is strictly included
in *X (in case X is an arbitrary abstract set and ${}^*X \neq X$, X need not be
a subset of *X).

c) Pairs. If (x, y) is a pair, then *(x, y) = (*x, *y), and similarly for n-tuples
$(x_1, \ldots, x_n)$.

d) Functions. If $f : X \to Y$, then ${}^*f : {}^*X \to {}^*Y$, and ${}^*f(x) = f(x)$ if $x \in X$.
Often the asterisk in *f may be dropped.

e) Operations. As an example consider addition in R. Its *-transform is
*-addition in *R, and $x \mathbin{{}^*{+}} y = x + y$ if $x, y \in \mathbb{R}$. The asterisk can safely be
dropped.

f) Atomic relations. These are relations in which neither logical connectives
nor quantifiers play a part, but only such relations as $<$ or $\in$, etc. Consider
first $<$ in R, leading to ${}^*{<}$ in *R. Similarly as under e) we have that
$x \mathrel{{}^*{<}} y$ is equivalent to $x < y$ if $x, y \in \mathbb{R}$, and again the asterisk can safely
be dropped. Next consider set inclusion. Let X be a subset of R, then
$\in X$ transforms to ${}^*{\in}\, {}^*X$. But ordinary set inclusion too is, of course,
applicable to *X, so that there would be two set inclusions for the
*-transform *X of X. Fortunately, the two are identical, so that dropping
the asterisk is a must.

g) The logical connectives, and both quantifiers. For all of them the *- 
transform is identical to the inverse image, so that asterisks should be 
dropped. 

h) Definitions. For example continuity transforms to *-continuity and *f,
introduced in Section 1.4, is *-continuous at c if the *-transform of the
classical continuity statement given there is true.

i) Statements. To some extent this covers, of course, case h). To find the
*-transform of a statement, it should be formulated in such a way that
each bound variable x occurs in some set inclusion of the form $x \in X$, not
$x \subset Y$. Then the *-transform is obtained by replacing each constant and
each free variable by its *-transform. As an example consider the classical
continuity statement of Section 1.4 and its *-transform. Note that if in the
classical statement '$\in \mathbb{R}$' would have been
left out, something different would have been obtained. This is one of the
reasons why at the end of Section 1.3 it was suggested to explicitly include
each bound variable in some set inclusion. Why the inclusion $x \subset Y$ should
be avoided will become clear in the next section.

One of the basic principles of nonstandard analysis is that any given classical 
statement (1.1) is true if and only if its *-transform is true, which results from
(1.1) by replacing all its constants and free variables by their *-transforms. Note 
that the bound variables are not replaced. The principle is applied both ways, 
from R to *R, or from *R to R. In either case one says that the deduction is 
done by 'transfer'. Assuming that everything is in prenex normal form, two simple 
nontrivial cases are, 



∀x ∈ X : P(x, s) and ∃x ∈ X : P(x, s),

where X is some set and P(x, s) is some atomic substatement with x a free
variable and s a constant or a free variable. The *-transforms are,

∀x ∈ *X : P(x, *s) and ∃x ∈ *X : P(x, *s), respectively.

Clearly, for each of the two classical statements transfer is trivial in one direction,
assuming that X is a subset of *X, but not necessarily in the opposite direction.
The following two implications are the nontrivial ones,

[∀x ∈ X : P(x, s)] ⇒ [∀x ∈ *X : P(x, *s)],

[∃x ∈ *X : P(x, *s)] ⇒ [∃x ∈ X : P(x, s)].

Note that the first implication starts from a classical statement and leads to its 
*-transform, whereas the second one starts from a *-transform and leads to the 
corresponding classical statement. 

In the vast majority of practical situations applying transfer is fairly obvious. In what
follows transfer is applied in a slightly more complicated situation, where it is required
to show the equivalence of statements (1.1) and (1.1), as well as that (1.1) can be
simplified to Q, with (1.1), (1.1) and (1.1) as in Section 1.4. Trivially, by transfer,
(1.1) and (1.1) are equivalent, so it remains to show the equivalence of (1.1) and
(1.1).

a) Let (1.1) be true, and let ε ∈ R, ε > 0, and δ ∈ *R, δ ~ 0 be arbitrary.
Then for some δ' ∈ R, δ' > 0,

∀x ∈ R, |x - c| < δ' : |f(x) - f(c)| < ε,

hence, by transfer, and because *c = c, *ε = ε, *δ' = δ', *f(c) = f(c),

∀x ∈ *R, |x - c| < δ' : |*f(x) - *f(c)| < ε.

Let x = c + δ, then, because by definition of infinitesimal |δ| < δ',
|*f(c + δ) - *f(c)| < ε. But since ε is arbitrary, this means that *f(c + δ) - *f(c) ~ 0,
and since δ is arbitrary that (1.1) is true.

b) Conversely, let (1.1) be true, and let ε ∈ R, ε > 0, and δ ∈ *R, δ ~ 0,
δ > 0, be arbitrary. For each x ∈ *R, |x - c| < δ it follows that x - c = δ'
for some δ' ~ 0, hence, by (1.1), *f(x) - *f(c) ~ 0, and, by definition of
infinitesimal, |*f(x) - *f(c)| < ε. Apparently,

∃δ'' ∈ *R, δ'' > 0 : ∀x ∈ *R, |x - c| < δ'' : |*f(x) - *f(c)| < ε,

(take, for example, δ'' = δ), hence, by transfer (in the opposite direction
as under a)),

∃δ'' ∈ R, δ'' > 0 : ∀x ∈ R, |x - c| < δ'' : |f(x) - f(c)| < ε.

Since ε is arbitrary this proves (1.1). □



1.6 Standard, internal, and external constants

Each *-transform is called standard, because it corresponds in a one-to-one way to some
classical constant, and in a number of cases is even identical to that constant. In
particular any set *X is standard. For example, *N, *R, and *P(R), with P(R)
the power set of R, are all standard. Note that the term 'standard' must not
be used within classical mathematics. The reason for this is that, for example,
a function f from R to R, regarded as a function of nonstandard analysis, may
not be standard, and usually it isn't. On the other hand, *f is standard, but this
is a function from *R to *R.

Each element (not subset) of a standard set turns out to be a special kind of 
constant of nonstandard analysis, namely a so-called internal constant, so that 
among others, infinitesimals and hyperlarge numbers are internal, because they 
are elements of *R. Also the classical reals are internal, as R ⊂ *R. Since *x = x
if x ∈ R, the reals are also standard (as ingredients of nonstandard analysis).
More generally, each standard constant happens to be internal, but the converse 
is not true, as is exemplified by infinitesimals and hyperlarge numbers. 

Not every constant of nonstandard analysis is internal. For example, neither R, 
nor the set of all infinitesimals, nor the set of all hyperlarge numbers is internal. 
Any constant that is not internal is called external. 

Whereas internal constants behave like classical constants, external constants do 
not. They have extraordinary properties. For example, although R is a subset
of *R that is bounded above in *R by any positive hyperlarge number, R has
no least upper bound in *R. For if b were such a bound, then b ~ ∞, and
b - 1 too would be an upper bound, but b - 1 < b. In a similar way it can
be shown that the (bounded) set of all infinitesimals has neither a least upper
bound nor a greatest lower bound; and that *N\N, the set of all hyperlarge natural
numbers, has no smallest element, whereas each nonempty subset of N has such
an element. Therefore, working with external constants or external variables that
are not recognized as such is rather dangerous. Fortunately, the occurrence of
external variables can be avoided by explicitly using in classical statements set
inclusions of the type x ∈ X (not x ⊂ Y), because then their *-transforms contain
the inclusions x ∈ *X, and x is automatically internal. If an inclusion like x ⊂ Y
is involved, it should be replaced by x ∈ P(Y), because an arbitrary subset of *Y
need not be internal, as we have already seen. In this way external variables can
literally be kept out of nonstandard analysis.

For any constant or variable, the diagram below shows the three main possibilities,

    standard and internal
    nonstandard and internal
    nonstandard and external

'Internal' may be read as 'mildly non-standard', and 'external' as 'extremely
non-standard'.



1.7 Infinitesimals in Greek geometry? 

Maybe it was Antiphon, a Greek mathematician and contemporary of Socrates, 
who for the first time contemplated the existence of infinitesimals. According to 
Heath [1] he, Antiphon, stated that, in Heath's words: 

"If one inscribed any regular polygon, say a square, in a circle, then 
inscribed an octagon by constructing isosceles triangles in the four 
segments, then inscribed isosceles triangles in the remaining eight 
segments, and so on 'until the whole area of the circle was by 
this means exhausted, a polygon would thus be inscribed whose 
sides, in consequence of their smallness, would coincide with the 
circumference of the circle'." 

There are at least two interpretations in modern terminology of this. One is 
that the end product of Antiphon's construction is a polygon with a hyperlarge 
number of sides, so that the length of each side is a positive infinitesimal. But 
this would imply that the end product would not coincide with the circumference 
of the circle, that is, not exactly. The other one is that the end product is the 
circumference of the circle itself. But this would imply that the end product 
no longer was a polygon. Either interpretation contains a contradiction, so it is 
difficult to say what really was in Antiphon's mind. 

Anyway, Antiphon's idea was not accepted by his fellow mathematicians. Again 
in Heath's words: 

"The time had, in fact, not come for the acceptance of Antiphon's 
idea, and, perhaps as the result of the dialectic disputes to which 
the notion of the infinite gave rise, the Greek geometers shrank 
from the use of such expressions as infinitely great and infinitely 
small and substituted the idea of things greater or less than any 
assigned magnitude. Thus, as Hankel says, they never said that a 
circle is a polygon with an infinite number of infinitely small sides; 
they always stood still before the abyss of the infinite and never
ventured to overstep the bounds of clear conceptions. They never
spoke of an infinitely close approximation or a limiting value of the 
sum of a series extending to an infinite number of terms." 

Note that the two interpretations mentioned above are also present in this quo- 
tation ('infinitely close' and 'limiting value'). 

Nevertheless, the Greek geometers solved many problems involving limits. They 
managed to do so by means of the so-called method of exhaustion. Given the 
problem to determine, say, the area of some figure, it is the method to find a 
sequence of inscribed figures as well as a sequence of circumscribed figures, each 
of known area, such that the given figure is approximated better and better by the 
terms of either sequence. But this does not mean that they thought in terms of 
limits. From the areas of the terms of both sequences they derived (guessed?) the 
area of the given figure, and a rigorous proof was obtained by showing that the 
proposed area of the given figure always lay between the areas of corresponding
terms of both sequences. All that we can criticize is that they took the existence 
of the desired area for granted. In fact they managed to determine many limits 
without ever presenting a definition of limit. 

Perhaps in his 'Method' Archimedes comes closer to the use of infinitesimals. For
example (see [1], Supplement, p. 15), when showing that the area of a segment 
ABC of a given parabola is 4/3 of the area of the triangle ABC, if, with D 
the middle point of the chord AC, BD is parallel to the axis of the parabola, 
Archimedes begins with some sort of plausible reasoning, where he states that 
the segment is made up of line segments between the parabola and the chord of 
the segment, all parallel to the axis of the parabola. Apparently, in his mind all 
these line segments together make up the entire segment of the parabola. It is 
tempting to conclude that the line segments were treated by him as parallelograms 
of hypersmall but positive breadth. At any rate, Heath ([1], Supplement, p. 8) 
writes that the line segments are 

"... of course ... indefinitely narrow strips (areas) . . .; but the 
breadth . . . (dx, as we might call it) does not enter into the calcula- 
tion because it is regarded the same in each of the two correspond- 
ing elements which are separately weighed against each other, and 
therefore divides out." 

If this were correct, Archimedes would have continued his plausible reasoning
by showing that the parallelograms could each be 'weighed' (letting the area of
a parallelogram be its weight) against one of the parallelograms making up a
certain figure F. But the area of F could easily be shown to be equal to 4/3 of
the area of triangle ABC.

There is an alternative, however, similar to the second interpretation mentioned
earlier when discussing Antiphon's idea, where not parallelograms but line seg- 
ments are weighed against each other (letting the length of a line segment be its 
weight). In fact Archimedes neither mentioned something like breadth, nor dis- 
cussed dividing something out at all. Instead, he considered line segments making 
up certain areas, not thin parallelograms. True, in this case the number of line 
segments is infinite, so a limit is involved, but when working with parallelograms 
each individual comparison of weights is not exact. And since (as Archimedes re- 
marks himself) the reasoning is not to be regarded as a rigorous one, it is not clear 
which interpretation is the right one. Anyway, Archimedes later on presented a 
rigorous proof - based on the method of exhaustion - where he could use the 
ratio 4/3 that he found by plausible reasoning. 

Let us close this discussion with Heath's remark that Archimedes' 'Method' is 
a rare instance where a Greek mathematician shows how his intuition has led 
him to the solution of some problem by means of plausible reasoning. Usually, 
in Greek mathematics any trace of the intuitive machinery used was completely 
cleared away. 

Open question: Did infinitesimals wander through the minds of some
Greek mathematicians, or did they not?



1.8 Infinitesimals in the 17th to the 19th century 

There can be no doubt that in the 1670's, some 1900 years after Archimedes lived, 
infinitesimals were conceived by Leibniz. Moreover, he formulated their main 
properties, and many contemporary mathematicians as well as mathematicians 
after him, among them Euler and Cauchy, were able to successfully work with 
them. But the theory of the infinitesimals lacked a rigorous basis, and for
some 200 years all attempts to improve this situation were in vain, so that at last one
gave up, the more so because in the 1870's Weierstrass came up with a rigorous
theory of limits and continuity, which became the basis of what now is known as 
classical analysis, and where there was and is no need to consider infinitesimals 
any more. 

It is quite interesting to see how Euler [2] shows the well-known product formula 
for the sine function. He begins his proof with the equality, 

2 · sinh x = (1 + x/n)^n - (1 - x/n)^n,

valid for - in Euler's own words - 'infinitely large values' of n. Obviously, this is
only true up to an infinitesimal. Then the right-hand side is treated as if n were
a classical natural number. This leads after a purely classical reasoning to,

(1 + x/n)^n - (1 - x/n)^n = (8x/n) · ∏_{k=1}^{m} sin^2(kπ/n) · {1 + x^2/(n^2 tan^2(kπ/n))},

where m = (n - 1)/2, taking n odd (the details of the reasoning do not matter
here, and the case for n even is similar). So,

sinh x = (4x/n) · ∏_{k=1}^{m} sin^2(kπ/n) · {1 + x^2/(n^2 tan^2(kπ/n))}.

Taking x ≠ 0, and dividing by x, and then taking x = 0, gives,

1 = (4/n) · ∏_{k=1}^{m} sin^2(kπ/n),

and hence,

sinh x = x · ∏_{k=1}^{m} {1 + x^2/(n^2 tan^2(kπ/n))}.

Now for finite k, n^2 tan^2(kπ/n) is 'infinitely close' to (kπ)^2, so (?)

sinh x = x · ∏_{k=1}^{∞} {1 + x^2/(k^2 π^2)},

and putting x = iz, this gives the desired result,

sin z = z · ∏_{k=1}^{∞} {1 - z^2/(k^2 π^2)}.

Obviously, at the question mark the argument goes a little too fast, and a number 
of steps must be included here (see e.g. Luxemburg [3]). 
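
The last finite product and its limit can be checked numerically. The following Python sketch is only an illustration, not part of Euler's argument: an ordinary large odd integer n stands in for an 'infinitely large' n, and the function names are hypothetical. It compares half of (1 + x/n)^n - (1 - x/n)^n with x · ∏_{k=1}^{m} {1 + x^2/(n^2 tan^2(kπ/n))}, and both with sinh x.

```python
import math

def binomial_side(x: float, n: int) -> float:
    """Half of (1 + x/n)^n - (1 - x/n)^n, Euler's stand-in for sinh x."""
    return ((1 + x / n) ** n - (1 - x / n) ** n) / 2

def finite_product(x: float, n: int) -> float:
    """x * prod_{k=1}^{m} (1 + x^2 / (n^2 tan^2(k*pi/n))), m = (n-1)/2, n odd."""
    if n % 2 == 0:
        raise ValueError("n must be odd")
    m = (n - 1) // 2
    result = x
    for k in range(1, m + 1):
        t = math.tan(k * math.pi / n)
        result *= 1 + x * x / (n * n * t * t)
    return result

def infinite_product(x: float, terms: int) -> float:
    """Truncation of x * prod_{k>=1} (1 + x^2 / (k^2 pi^2)) after `terms` factors."""
    result = x
    for k in range(1, terms + 1):
        result *= 1 + x * x / (k * k * math.pi ** 2)
    return result

if __name__ == "__main__":
    x = 1.5
    for n in (5, 51, 501, 5001):
        # The first two columns agree for every odd n; both approach sinh x.
        print(n, binomial_side(x, n), finite_product(x, n), math.sinh(x))
    print(infinite_product(x, 20000), math.sinh(x))
```

For every odd n the two finite expressions agree up to rounding, whereas their agreement with sinh x only improves as n grows. This is the numerical counterpart of the remark that the starting equality holds only up to an infinitesimal.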

Another famous example is Cauchy's proof ([4], p. 131), that a convergent series 
of continuous functions has a continuous limit function. To many this theorem 
was not correct, because it would seem that all kinds of counter-examples could 
be given. One of them is the series with the partial sums,

s_n(x) = (4/π) · Σ_{k=0}^{n} sin((2k + 1)x)/(2k + 1),

that is periodic modulo 2π and converges to,

f(x) = -1 if -π < x < 0,
f(x) = 0 if x = 0 or x = π,
f(x) = +1 if 0 < x < π,

as can be shown by classical Fourier analysis. Since the sine function is everywhere
continuous and s_n(x) converges to f(x) for n tending to ∞, according to Cauchy's
theorem f ought to be continuous, which it isn't. But so far, everything takes place
within R, and Cauchy let everything happen in what we have indicated by *R.

For him continuity of f at c meant that,

∀x ∈ *R, x ~ c : f(x) ~ f(c),

where, however, f : *R → *R and f need not be a standard function, and c ∈ *R,
not only c ∈ R, which is why his continuity is not *-continuity (in nonstandard
analysis it is called S-continuity; recall definition (1.1) in Section 1.4, where c ∈ R
and a standard function was involved, so that there S-continuity was the same
as *-continuity).

And by convergence of s_n(c) to f(c) he meant that,

∀n ~ ∞ : s_n(c) ~ f(c),

where again everything is in *R. Note that the Weierstrassian definitions of limit 
and continuity appeared half a century after Cauchy's book, so Cauchy in a sense 
'had to' work with definitions of the kind given here. 

Now, by transfer,

*s_n(x) = (4/π) · *Σ_{k=0}^{n} *sin((2k + 1)x)/(2k + 1), n ∈ *N, x ∈ *R,

and

*f(x) = -1, or 0, or +1, x ∈ *R,

since if the range of a classical function f is finite, the range of its transform is
the same as that of f.

Let m ~ ∞ be fixed, and let x = c = 1/(2m), and dt = 1/m, so that x ~ 0,
dt ~ 0. Then,

*s_m(c) = (2/π) · *Σ_{k=0}^{m} [*sin((2k + 1)dt/2) / ((2k + 1)dt/2)] · dt.

If we had that m ∈ N, then the sum to the right would be an approximation of
the Riemann-integral,

J = ∫_0^1 (sin t)/t dt,

and it should therefore not come as a surprise that it can be shown that the
standard part of the right-hand side is exactly equal to 2J/π, and hence,

*s_m(c) ~ 2J/π.

But by direct calculation it follows that 2J/π ≠ -1, 0, and +1, and since in
particular for c = 1/(2m), *f(c) = -1, or 0, or +1 (-1 is in fact impossible),
it follows that *s_n(c) does not converge to *f(c). Also, since for all n, *s_n(0) = 0,
*s_m(x) is not continuous at c = 0, so that the 'counter-example' does not satisfy
the assumptions of Cauchy's theorem, and this is why Cauchy maintained his 
theorem against all criticism, but without basing his proof (and much of his 
other work) on a rigorous theory of the infinitesimals and other nonstandard 
numbers. For many interesting details, see Lakatos [5]. 
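
The value 2J/π can be made visible with ordinary floating-point arithmetic. The sketch below is an illustration under stated assumptions: a large classical m stands in for the hyperlarge m of the text, the partial sums are taken to start at k = 0, and J is evaluated from the power series of (sin t)/t integrated term by term. The function names are hypothetical.

```python
import math

def partial_sum(n: int, x: float) -> float:
    """s_n(x) = (4/pi) * sum_{k=0}^{n} sin((2k+1)x) / (2k+1)."""
    return (4 / math.pi) * sum(
        math.sin((2 * k + 1) * x) / (2 * k + 1) for k in range(n + 1)
    )

def sine_integral_at_one(terms: int = 20) -> float:
    """J = integral_0^1 sin(t)/t dt = sum_{k>=0} (-1)^k / ((2k+1) * (2k+1)!)."""
    return sum(
        (-1) ** k / ((2 * k + 1) * math.factorial(2 * k + 1))
        for k in range(terms)
    )

if __name__ == "__main__":
    J = sine_integral_at_one()
    print("2J/pi =", 2 * J / math.pi)
    # Evaluate s_m at the moving point c = 1/(2m): the values settle near 2J/pi,
    # not near -1, 0 or +1.
    for m in (10, 100, 1000, 10000):
        print(m, partial_sum(m, 1 / (2 * m)))
    # At a fixed point x > 0 the partial sums do approach +1.
    print(partial_sum(20000, 0.5))
```

At the fixed point the sums approach +1, while along c = 1/(2m) they approach 2J/π instead, which is the classical shadow of the statement that *s_m(c) is not infinitely close to *f(c).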



1.9 Infinitesimals in the 20th century 

When in the 1870's Weierstrass formulated the well-known ε-δ definitions of
limit and continuity, definitions that completely ignore nonstandard numbers,
the dispute regarding infinitesimals was quickly settled to their disadvantage, but
only temporarily, for in 1961 Robinson [6,7] presented a mathematically sound
theory of the nonstandard numbers. These works embody the first fairly complete
analysis of the nonstandard numbers. Not only are they based on work of
forerunners, but also on an amount of mathematical logic that hitherto was unusual
in mathematics. Only a few references should suffice here, see [8-12].

Robinson starts from the axioms of set theory due to Zermelo and Fraenkel, and 
the axiom of choice (called together the ZFC axioms), derives R in a classical 
kind of way, and then extends R to *R by applying a rather considerable amount 
of mathematical logic, as indicated before. 

Another way to define *R was already indicated by Hewitt [10] and worked out by 
Luxemburg [13] . Here the ZFC axioms are again the point of departure, but the 
more usual line of mathematical thinking is followed. (Except for the ZF axioms, 
this way is also followed in the next chapter.) 

Still another way to introduce *R was found by Nelson [14]. Nelson adds three 
more axioms to the ZFC axioms, as well as a new symbol, st (for 'standard') that 
is used as a kind of label to distinguish standard constants from nonstandard 
constants. This leads directly to the set of all standard as well as all nonstandard 
constants, without the intermediary step of first introducing R; consequently
in internal set theory *R is denoted by R, and similarly, *N is denoted by N,
etc. Actually, the point of view of internal set theory is that the N of classical
mathematics is the same as the N of nonstandard analysis; and that all that
happens is that unexpected elements of N are discovered, elements that had
always been there. In other words, according to this point of view, 0, 1, 2, etc.
do not at all fill up N (see Robert [15] and F. Diener and G. Reeb [16]). The
additional axioms make sure that transfer is guaranteed (axiom of 'transfer'), that 
nonstandard numbers exist (axiom of 'idealization'), and that unique standard 
sets can be derived from given sets (axiom of 'standardization'). Even though 
internal set theory uses relatively little of mathematical logic, the new axioms 
require some study, and do not seem to be as obvious as, for example, the axioms 
of Greek geometry: 

Transfer: ∀^st t_1 ... ∀^st t_k : [∀^st x : P(x, t_1, ..., t_k) ⇒ ∀x : P(x, t_1, ..., t_k)].

Idealization: [∀^{st fin} z : ∃x : ∀y ∈ z : P(x,y)] ⇒ [∃x : ∀^st y : P(x,y)].

Standardization: ∀^st x : ∃^st y : ∀^st z : [z ∈ y ⇔ z ∈ x ∧ P(z)].

Here ^st u means that the variable u must be standard, and similarly the label fin
means that the corresponding variable must be finite (but beware, in internal set
theory any hyperlarge natural number is finite, only the combination of standard
and finite amounts to the classical notion of finiteness). Note that whereas ^st u
means that the variable u is standard, *A means the variable *A is standard,
because st is a label but * is a mapping. P(...) denotes a given internal statement,
except in the last axiom, where P(...) may even be external (see Section 1.6).

In naive nonstandard analysis these three additional axioms are not assumed 
but derived from the existence of the natural numbers and the axiom of choice. 
Transfer has already been discussed; and idealization is used to prove the existence 
of nonstandard elements in any internal set with an infinite number of elements. 
Perhaps standardization is the most intriguing of the three because it contains a 
statement P(z) that may be external. Reformulated naively it means that, 

∀*x : ∃*y : ∀*z : [*z ∈ *y ⇔ *z ∈ *x ∧ P(*z)],

where x, y and z are, of course, classical. Since always *s ∈ *S if and only if s ∈ S,
it follows that,

y = {z ∈ x : P(*z)},

or equivalently,

*y = *{z ∈ x : P(*z)},

which in internal set theory are illegal set formations. Here are a few examples,
where x and y are still classical, but z need not be classical.

1) P(z) = z ∈ *N ∧ z is standard;
   then x = {1, 2, 3} gives y = x = {1, 2, 3},
   x = N gives y = x = N,
   and x = R also gives y = N.
   In fact N is the largest y that is possible for variable x.

2) P(z) = z ∈ *N ∧ z < n, with n ∈ *N given such that n ~ ∞;
   then the results are as under 1).

3) P(z) = z ∈ *R ∧ z ~ 0;
   then y = {0} if 0 ∈ x and y = ∅ if 0 ∉ x.

For other details the reader should consult more adequate treatments of internal 
set theory. 

In the meantime other versions of nonstandard analysis have been developed.
In one of them external sets are 'legalized' by means of still other axioms, and 
another label, ext (for 'external'). 

By now many hundreds of publications have been devoted to nonstandard anal- 
ysis: it is an established branch of mathematics. 

No matter how infinitesimals are introduced, with or without the axioms of set 
theory, with or without extra axioms and new undefined symbols (st and ext),
always the axiom of choice seems indispensable. If one tries to develop infinitesimal
calculus without this axiom, it seems that one should be satisfied with a mutilated 
theory, as will be explained later on in Section 4.4. Here attempts by Chwistek 
[17,18] in this direction should be mentioned. In his 1926 paper Chwistek intro- 
duces new numbers by means of infinite sequences of classical numbers. These 
new numbers are called Progressionszahlen ('sequence numbers'), and equality 
for them is defined as follows. Let N_i(α_i) and N_i(β_i) be two new numbers, then,

N_i(α_i) = N_i(β_i) if and only if α_i = β_i for i > n for some n ∈ N.

Something similar is done to define inequality, and an operation like addition is
defined by,

N_i(α_i) + N_i(β_i) = N_i(α_i + β_i).

A classical function f is extended by means of,

f(N_i(α_i)) = N_i(f(α_i)).

The extended function happens to be quite similar to *f, the *-transform of f.
Even so not much new calculus is developed. An extension of R that includes all 
sequence numbers could be introduced, however. 

In his 1948 book Chwistek spends fewer than ten pages on the subject, but
nevertheless shows that he is well aware of the fact that 'infinitely small' numbers
can be introduced, and he also introduces internal functions (called normal func- 
tions by him). Again there is no fully expanded calculus. Most likely, the deeper 
reason for this is that Chwistek defines (in)equality for his sequence numbers as 
indicated above. This definition has the advantage that the axiom of choice is not 
needed, but leads to rather serious problems, as will become clear in Section 4.4. 
It remains to remark that working with sequences is a technique used by Hewitt 
[10] and Luxemburg [13], and will be the technique of the next chapter, which is 
based on assumptions that from a naive, intuitive point of view are understand- 
able, obvious, and acceptable, except perhaps the axiom of choice, and where 
everything that is not so obvious, such as transfer and all the rest, will be proved, 
rather than assumed. 



1.10 Introducing infinitesimals by plausible reasoning; 
filters 

Let f be a function from R to R, and let c and b be real numbers. Then in
today's notation the definition according to Leibniz's ideas that the limit of f(x)
for x tending to c is equal to b, is as follows,

∀x, x - c ~ 0 : f(x) - b ~ 0.

As already mentioned, neither Leibniz nor anybody else at that time gave a
mathematically sound definition of infinitesimal, and a dispute started, that temporarily
stopped in the 1870's when Weierstrass was able to present a mathematically
sound limit definition, that was completely void of infinitesimal thinking:

∀ε > 0 : ∃δ > 0 : ∀x, 0 < |x - c| < δ : |f(x) - b| < ε,

where ε and δ too are real numbers. This definition leaves no room for
misunderstanding, and is even intuitively clear: after Mr. E. specifies any ε > 0, Mrs.
D. is able to specify a δ > 0, such that if x is within a distance δ of c (x must
not be equal to c), then f(x) is within a distance ε of b (f(x) may be equal to
b). Note that Mrs. D. is able to come up with a δ > 0 no matter how small ε > 0
turns out to be; in each case she has a response.

Leibniz's definition is more attractive: it contains only one quantifier, against
three in Weierstrass's definition, and even though the latter is intuitively clear, it 
is not as easy to grasp, not even for beginning mathematics students (probably 
for many other students it never becomes entirely clear). 
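
The exchange between Mr. E. and Mrs. D. can be mimicked numerically. The sketch below is only an illustration under assumptions that are not in the text: it takes f(x) = x^2, c = 2 and b = 4, and lets Mrs. D. answer with δ = min(1, ε/5), a choice that works because |x^2 - 4| = |x - 2| · |x + 2| < 5 |x - 2| whenever |x - 2| < 1. The check samples finitely many points, so it illustrates the definition rather than proving anything.

```python
def f(x: float) -> float:
    return x * x

C, B = 2.0, 4.0  # assumed example: the limit of x^2 for x tending to 2 is 4

def mrs_d(eps: float) -> float:
    """Mrs. D.'s response delta to Mr. E.'s challenge eps (valid for this f only)."""
    return min(1.0, eps / 5.0)

def respects(eps: float, delta: float, samples: int = 1000) -> bool:
    """Check |f(x) - b| < eps at sample points x with 0 < |x - c| < delta."""
    for j in range(1, samples + 1):
        h = delta * j / (samples + 1)  # 0 < h < delta
        for x in (C - h, C + h):
            if not abs(f(x) - B) < eps:
                return False
    return True

if __name__ == "__main__":
    for eps in (1.0, 0.1, 0.01, 0.001):
        delta = mrs_d(eps)
        print(eps, delta, respects(eps, delta))
    # A delta that ignores eps fails for a small enough challenge.
    print(respects(0.1, 1.0))
```

Each smaller ε forces a smaller δ here, so repeated challenges produce two decreasing sequences rather than a single static statement.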

Let us now see how by means of plausible reasoning we can find a good definition 
of infinitesimal. First note that the ε-δ definition is of a dynamical nature. If
Mr. E. first presents ε = ε_1, after which Mrs. D. presents δ = δ_1, he may present
another ε = ε_2 that is so small that Mrs. D. is forced to present another, i.e.
smaller δ = δ_2 (excluding trivial cases where f is constant in a neighborhood of
c). Repeating this again and again, infinite sequences (ε_1, ε_2, ...) and (δ_1, δ_2, ...)
are the result, and Mr. E. and Mrs. D. are involved in some dynamical process.
Obviously, ε_1 > ε_2 > ..., and δ_1 > δ_2 > ... and it is clear that the ε's as well
as the δ's tend to zero (again excluding trivial cases). Instead of considering the
sequences (ε_1, ε_2, ...) and (δ_1, δ_2, ...), we can just as well consider the sequences
(f(x_1), f(x_2), ...) and (x_1, x_2, ...) and require that if x_i - c tends to 0 for i tending
to infinity, but such that x_i - c ≠ 0 for all i, then f(x_i) - b too should tend to 0
for i tending to infinity. Let us abbreviate this to:

(x_i - c) → 0 ⇒ (f(x_i) - b) → 0,

where (x_i - c) and (f(x_i) - b) stand for the sequences (x_1 - c, x_2 - c, ...) and
(f(x_1) - b, f(x_2) - b, ...), respectively. Now it so happens that the ε-δ definition
is equivalent to,

∀(x_i), (x_i - c) → 0, x_i - c ≠ 0 : (f(x_i) - b) → 0.

Exercise: Show this.

This formulation contains only one quantifier, and in form comes quite close to
Leibniz's definition. However, we have traded two quantifiers for two converging
sequences. This is overcome by replacing sequences that converge to 0 by
infinitesimals, or rather, by letting a sequence that converges to 0 generate an
infinitesimal, which will have the effect that the dynamical process mentioned
above is replaced by something static.

So, let each infinite sequence (s_1, s_2, ...) of real numbers s_i that converges to 0
generate an infinitesimal, and let this number be indicated by H(s_1, s_2, ...) or
simply by H(s_i), (with the H of Hyper).

As an example, let s_i = 1/i, then H(s_i) is an infinitesimal, and since 1/i > 0 for all
i it should be positive. It is also reasonable to require that H(0, 0, ...) = H(0) = 0.
Moreover, it should be possible to treat infinitesimals and other nonstandard
numbers as ordinary numbers. This requires that, for example, addition,
subtraction and the greater-than relation be defined for them. The natural, but partly
wrong, guesses (that are almost identical to Chwistek's definitions mentioned in
Section 1.9) are,

addition: H(s_i) + H(t_i) = H(s_i + t_i),
subtraction: H(s_i) - H(t_i) = H(s_i - t_i),
greater than: H(s_i) > H(t_i) if s_i > t_i for all i ∈ N.

Note that so far everything (being positive, being zero, addition, subtraction, 
greater than) is defined term by term. Let us adopt this as a basic rule. Unfor- 
tunately, this leads to trouble. For let, 

(s_i) = (1/1, 3/8, 1/4, 3/32, 1/16, 3/128, ...), and
(t_i) = (7/8, 7/16, 7/32, 7/64, 7/128, 7/256, ...).

Then both (s_i) and (t_i) converge to 0, and the terms of both (s_i) and (t_i) are
all positive, and nicely decrease with increasing i. So H(s_i) > 0, H(t_i) > 0,
H(s_i) ~ 0 and H(t_i) ~ 0, and there seems to be no reason at all to reject these
sequences as generating sequences of certain infinitesimals. But now consider the
difference of H(s_i) and H(t_i), which is generated by the sequence (s_i - t_i), i.e. by,

(+1/8, -1/16, +1/32, -1/64, +1/128, -1/256, ...),

then, as everything should be defined term by term, the conclusion must be that
H(s_i - t_i) is different from zero, is not positive and is not negative, hence that
H(s_i - t_i) does not behave as an ordinary number. Where did we go wrong?
The example given would not give trouble if the definition of greater-than were
revised:

H(s_i) > H(t_i) if s_i > t_i for all even i,

hence if only the even indices were 'accepted' and the odd indices were 'rejected'.
Then H(s_i - t_i) would be negative. Not only would this revision be rather
arbitrary, as, for example, 'even' could have been replaced by 'odd', so that the odd
indices would be accepted and the even indices would be rejected, it would not
work either, as a new example could be concocted resulting in a hypernumber
that was neither positive, nor negative, nor zero. The only way out would be
that given any partitioning of N into two disjoint subsets of indices i, it would be
allowed to accept one of these subsets (that is to say that all of its elements would
be accepted) if it were infinite, and reject the other, and given any partitioning
of any accepted subset into two disjoint subsubsets, it would again be allowed to
accept one of the latter if it were infinite and reject the other, such that any
subset of a rejected subset is itself rejected, etc., etc., ad infinitum. That accepted
subsets would have to be infinite is a reasonable requirement as infinitesimals
should be generated by certain infinite sequences.
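
The sign problem of the example can be checked with exact rational arithmetic. The sketch below is only an illustration: the closed forms used for s_i and t_i are read off from the six listed terms (an assumption about how the sequences continue), a finite window of indices stands in for N, and the helper name `sign_on` is hypothetical.

```python
from fractions import Fraction

def s(i: int) -> Fraction:
    """s_i for i = 1, 2, ...: 1/1, 3/8, 1/4, 3/32, 1/16, 3/128, ..."""
    return Fraction(7 + (-1) ** (i + 1), 2 ** (i + 2))

def t(i: int) -> Fraction:
    """t_i for i = 1, 2, ...: 7/8, 7/16, 7/32, 7/64, ..."""
    return Fraction(7, 2 ** (i + 2))

def diff(i: int) -> Fraction:
    """Term i of the sequence generating H(s_i) - H(t_i)."""
    return s(i) - t(i)

def sign_on(indices, seq) -> str:
    """Term-by-term sign of seq, looking only at the given (accepted) indices."""
    values = [seq(i) for i in indices]
    if all(v > 0 for v in values):
        return "positive"
    if all(v < 0 for v in values):
        return "negative"
    if all(v == 0 for v in values):
        return "zero"
    return "undecided"

WINDOW = range(1, 41)                      # finite stand-in for all indices
EVEN = [i for i in WINDOW if i % 2 == 0]
ODD = [i for i in WINDOW if i % 2 == 1]

if __name__ == "__main__":
    print([str(diff(i)) for i in range(1, 7)])  # 1/8, -1/16, 1/32, ...
    print(sign_on(WINDOW, s), sign_on(WINDOW, t))
    print(sign_on(WINDOW, diff))   # all indices: no sign can be assigned
    print(sign_on(EVEN, diff))     # accepting the even indices
    print(sign_on(ODD, diff))      # accepting the odd indices
```

Accepting the even indices makes the difference negative and accepting the odd ones makes it positive, which shows how much the outcome depends on the arbitrary choice of accepted indices.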

Even then the definition of greater-than would be rather arbitrary, but for all
sequences $(s_i)$, $H(s_i)$ would always be either positive, or negative, or zero.
To see this, let $Q_0 = \{i : s_i > 0\}$. Then if $Q_0$ is accepted, hence
$Q_1 = \{i : s_i \leq 0\}$ is rejected, $H(s_i)$ is positive. On the other hand, if
$Q_0$ is rejected, so that $Q_1$ is accepted, then $H(s_i)$ is nonpositive. In the
latter case, let $Q_{10} = \{i : s_i < 0\}$. Then if $Q_{10}$ is accepted, hence
$Q_{11} = \{i : s_i = 0\}$ is rejected, $H(s_i)$ is negative, and if $Q_{10}$ is
rejected, so that $Q_{11}$ is accepted, then $H(s_i)$ is zero. Note that $Q_0$ and
$Q_1$ are two complementary subsets of $\mathbb{N}$ and that $Q_{10}$ and $Q_{11}$
are two complementary subsets of $Q_1$. In this example it has tacitly been assumed
that all the $Q$'s involved are infinite sets. If this is not so, fewer cases need
be examined.
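The index sets used in this case analysis can be computed for a finite prefix
of a sequence. The following Python sketch is only an illustration: the helper
name index_sets, the cut-off n and the closed formula for the alternating
sequence are assumptions of this example, and a finite prefix of course decides
nothing about which infinite index set is accepted.

```python
from fractions import Fraction


def index_sets(s, n):
    """Split the indices 1..n by the sign of s(i).

    Returns the sets {i : s(i) > 0}, {i : s(i) < 0} and {i : s(i) = 0},
    each restricted to the first n indices.
    """
    positive = {i for i in range(1, n + 1) if s(i) > 0}
    negative = {i for i in range(1, n + 1) if s(i) < 0}
    zero = {i for i in range(1, n + 1) if s(i) == 0}
    return positive, negative, zero


def alternating(i):
    """i-th term of the sequence (+1/8, -1/16, +1/32, -1/64, ...)."""
    sign = 1 if i % 2 == 1 else -1
    return sign * Fraction(1, 2 ** (i + 2))


positive, negative, zero = index_sets(alternating, 10)
print(sorted(positive))  # the odd indices among 1..10
print(sorted(negative))  # the even indices among 1..10
print(sorted(zero))      # no index gives a zero term
```

For this sequence the positive terms sit on the odd indices and the negative
terms on the even indices, so the sign of the generated hypernumber depends
entirely on which of these two index sets is accepted.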

Note that a revision of the definitions for addition and subtraction would not be 
necessary, because although the attention could be restricted to some accepted 
subsets of indices i, there is no reason to do so. 

Still there is some trouble. There is strong evidence that if we are fully free to
accept or reject subsets of indices, but such that the given rules are obeyed
($\mathbb{N}$ is accepted; accepted sets must be infinite; of any two disjoint sets
whose union is equal to an accepted set, precisely one must be accepted; subsets of
rejected subsets must be rejected), we will never be able to give a complete
specification of our choices or preferences. This means that most likely we will
never be able to write a computer program that contains all our preferences and
that, once it has been completed and loaded into a computer, outputs either
'accepted' or 'rejected' for any subset $Q$ of indices presented to it as input.

Exercise: If this seems unbelievable, just try to give such a complete specification 
in the example above. See Section 1.16. 

Although it seems not possible to specify all preferences in a constructive kind of 
way, it can be shown that a 'complete system of preferences' exists if the axiom 
of choice is invoked. The more technical term for such a system is free ultrafilter. 
But there are many such filters, and when starting from one of them $H(s_i - t_i)$
may turn out to be positive, whereas when starting from another one $H(s_i - t_i)$
may turn out to be negative, because in the former case all odd indices and in
the latter case all even indices would form an accepted set. We shall have to live
with this arbitrariness, however. On the other hand, in practice only one free
ultrafilter is needed. It is selected once and for all, and although there is much
arbitrariness in its selection, after the selection it follows uniquely for each
$H(s_i)$ whether it is zero, or positive, or negative, and similarly for all other
choices with a finite number of alternatives that will present themselves.
Ironically, we will never know the 'selected' filter completely, or rather we will
only know it extremely incompletely. See Sections 1.15 and 1.16.



1.11 Basic assumptions of formalism 

The two main schools of mathematical thinking are formalism and constructivism. 
They will be reviewed in the present and the next sections. 

In formalism, which is the predominant school, the basic assumptions are the ZF
axioms of set theory due to Zermelo and Fraenkel (or equivalent variations thereof).
In them each symbol that represents a variable represents a set, and in each of
them the symbol $\in$ occurs. Hence there is no reference at all to numbers, or
geometrical concepts, or whatever. Instead of the word 'set' and the symbol '$\in$'
a word and a symbol not yet existing could be used, because the axioms only have
a formal meaning. Nevertheless they are intended to fix intuitive ideas regarding
'things contained in other things', but that is their semantic aspect. Even though
it is often said that in formalism the notion of set is undefined, and that $\in$ is
an undefined symbol, to a certain, but limited, extent they are defined implicitly
by the axioms, because after all $x \in y$ must be a statement, i.e. something that
has a truth value (but recall that logic too can be formalized).

Other basic assumptions are that from the axioms other truths can be derived 
by applying the well-known rules of logic, such as the syllogisms, the rules of 
substitution, and the principle of the excluded third (or middle). 

In formalism logic prevails over mathematics, that is to say that mathematics is 
subject to all rules of logic. In constructivism the order is reversed, which has the 
consequence that each rule of logic has to be screened before it can be accepted 
as a rule for mathematical reasoning. This has resulted in the rejection of just 
one rule of logic, namely the principle of the excluded third. 

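Here are the first seven of the ZF axioms (see, for example, C.C. Chang and H.J.
Keisler [19]):

1. $\forall x, y : (x = y \Leftrightarrow (\forall z : (z \in x \Leftrightarrow z \in y)))$.

2. $\exists x : \forall y : (\neg y \in x)$.

3. $\forall x, y : \exists z : \forall u : (u \in z \Leftrightarrow (u = x \vee u = y))$.

4. $\forall x : \exists y : \forall z : (z \in y \Leftrightarrow (\exists w : (z \in w \wedge w \in x)))$.

5. $\forall x : \exists y : \forall z : (z \in y \Leftrightarrow (\forall w : (w \in z \Rightarrow w \in x)))$.

6. $\exists x : ((\exists y : y \in x) \wedge (\forall z : (z \in x \Rightarrow \exists w : (z \in w \wedge w \in x))))$.

7. $\forall x : (\exists y : y \in x \Rightarrow (\exists z : (z \in x \wedge \neg(\exists w : w \in z \wedge w \in x))))$.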

The formal character of these axioms may be emphasized by not mentioning their
intuitive meaning, and by replacing $\in$ by any other suitable symbol. Whatever
the interpretation of set and $\in$, it can be shown that the $x$ whose existence is
secured in Axiom 2 is unique: for assume that $x$ and $x'$ are such that,

$\forall y : (\neg y \in x)$ and $\forall y : (\neg y \in x')$.

Then it has to be shown that $x = x'$, or, by Axiom 1, that
$\forall z : (z \in x \Leftrightarrow z \in x')$. But for any $z$ it follows that
$\neg z \in x$ and $\neg z \in x'$, hence for any $z$, both implications
$z \in x \Rightarrow z \in x'$ and $z \in x' \Rightarrow z \in x$ hold, so that the
equivalence holds as well. Obviously, the intuitive meaning of the $x$ of Axiom 2 is
that of the empty set, and apparently the axioms dictate that there is a unique
empty set, but for the proof the intuitive interpretation of neither set nor $\in$
is required, which illustrates the formalistic character of formalism.

Axiom 6, which is the 'infinity axiom', together with other axioms takes care of
the existence of an infinite set that, apart from the empty set $\emptyset$,
contains the following sets as elements:

$\{\emptyset\}$, $\{\emptyset, \{\emptyset\}\}$, $\{\emptyset, \{\emptyset\}, \{\emptyset, \{\emptyset\}\}\}$, etc.

Here the set denoted by $\{\emptyset\}$ is defined by the requirements that
$\emptyset \in \{\emptyset\}$ and that if $x \in \{\emptyset\}$ then $x = \emptyset$,
and similarly for the other sets.

Only now numbers appear; by definition,

$0 = \emptyset$, $1 = \{0\}$, $2 = \{0, 1\}$, $3 = \{0, 1, 2\}$, etc.

In this way the natural numbers, including zero, appear as sets. Once in the 
possession of them the integers, the rationals, the reals, etc. can be defined in the 
well-known ways. 
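The definition of the natural numbers as sets can be imitated with nested
frozensets. This is a minimal Python sketch, assuming the successor step
$k + 1 = k \cup \{k\}$ that is implicit in the list above; the function name
von_neumann is chosen for this example only.

```python
def von_neumann(n):
    """Return the set that represents the natural number n, as a frozenset."""
    current = frozenset()              # 0 is the empty set
    for _ in range(n):
        # successor step: k + 1 is k united with {k}
        current = current | frozenset([current])
    return current


zero, one, two, three = (von_neumann(k) for k in range(4))
print(one == frozenset([zero]))              # checks 1 = {0}
print(two == frozenset([zero, one]))         # checks 2 = {0, 1}
print(three == frozenset([zero, one, two]))  # checks 3 = {0, 1, 2}
print(len(three))                            # number of elements of the set 3
```

In this representation each number is the set of all smaller numbers, so that
'smaller than' between numbers coincides with 'element of' between sets.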

Most likely, Zermelo and Fraenkel's desire was to formulate a set of axioms that
would lead to the natural numbers, but nothing more, i.e. to $0$ and the elements
of $\mathbb{N}$, but in fact the axioms happen to lead to $0$ and ${}^*\mathbb{N}$.
After additional steps the latter leads to Nelson's internal set theory (see Section
1.9). If one insists on just $\mathbb{N}$, there is a way out: take the intersection
of all sets that contain 0, 1, 2, etc., in other words, take the minimal infinite
set whose existence follows from Axiom 6. Altogether a fairly complicated way to get
at the natural numbers and nothing more.

Moreover, one wonders whether in order to be able to even write down the ZF 
axioms the natural numbers are not at least implicitly presupposed. For this 
requires symbols to be written down one after the other in a linear order. There 
is a leftmost symbol, with just one right neighbor, for which the same is true, etc., 
until the rightmost symbol is reached (assuming sufficiently long lines). It would 
seem that here the natural numbers are implicitly used for ranking purposes. In 
addition, the required order is not only of a spatial nature, but of a temporal nature
as well, because the symbols must be written down after each other, presumably 
starting from the leftmost symbol, then proceeding with its right neighbor, etc. 
But here space and time are used as physical notions. 



1.12 Basic assumptions of constructivism 

Constructivism occurs in various forms, among which intuitionism and the the- 
ory of recursive functions, but only what they have in common will be briefly 
reviewed here. One of the starting points of constructivism is that before really 
starting mathematics the natural numbers, 1,2,3,..., are already given to us, 
simply because of our ability to count, and to do so indefinitely. This makes the 
'infinity axiom' (Axiom 6 of the axioms listed in the previous section) superfluous, 
and has the agreeable consequence that the natural numbers are not sets. Never- 
theless, there is no real difference between formalism and constructivism as far as 
the natural numbers in ordinary mathematics are concerned, since in formalistic 
mathematics the fact that the natural numbers are sets is largely ignored. 

The same can be said about the rationals, but there is an essential difference
as far as the reals are concerned, which is a consequence of the fact that the
rationals form a countable set, but the reals do not. Yet the real numbers of both
schools correspond to each other in a one-to-one kind of way. Differences appear
if continua play a part. For example, according to both schools there exists a
function $f$ defined on the interval $[0, 1]$ of all rational numbers such that for
all $x \in [0, 1]$, either $f(x) = 5$ or $f(x) = 7$, and such that both values are
assumed somewhere. But as soon as this interval is replaced by the interval $[0, 1]$
of all real numbers, this is only true within formalism, because within
constructivism such an $f$ would not be accepted as a function, because it could not
be defined constructively. In fact, within constructivism all functions defined on a
finite real closed interval are continuous. For more details, see, for example,
M.J. Beeson [20], or E. Bishop and D. Bridges [21].

More generally, a basic assumption (or perhaps restriction) in constructivism is
that everything (definitions, proofs, etc.) must be of a constructive nature. For
example, in formalism the following is a proper definition of $p$,

$p$ is the largest prime number such that $p - 2$ too is a prime number,
or if there is no such largest prime number, $p = 1$.

But in constructivism it is not, because it is not (yet) known whether the
number of so-called twin primes is finite, hence (up to today) the definition
is not constructive and must be rejected.

Another example is the 4-color problem. Let q be the minimum number of colors 
that is required to color any given map of countries, such that each country 
receives just one color, and neighboring countries receive different colors. Before 
1977 q was not well defined within constructivism, because one did not know 
whether q was 4 or 5, but after 1977 q is also well defined in constructivism, 
because in 1977 Appel and Haken showed constructively that $q = 4$.

A consequence of constructivism is that it puts a restriction on the use of the
logical principle of the excluded third. Consider some binary relation $R$, a
statement,

$\forall x \in \mathbb{R} : x R y$, (1.1)

and its negation,

$\exists x \in \mathbb{R} : \neg x R y$. (1.2)

Then within formalism either (1.1) or (1.2) holds, because of the principle of the
excluded third. But within constructivism there are three possibilities: either
(1.1) can be shown constructively, or (1.2) can be shown constructively, or neither
(1.1) nor (1.2) can be shown constructively. The third possibility arises because a
special kind of proof is required. It follows that within constructivism mathematics
prevails over logic: the principles of logic must first be checked before they can
be accepted as tools in mathematical reasoning. In fact, the principle of the
excluded third is the only logical principle that is rejected.

There is some flexibility in constructivistic thinking, though, as can be seen from
the next example, taken from E.W. Beth [22], where it is required to show
constructively that,

$1^1 + 2^2 + \ldots + 99^{99}$

is either divisible by 7 or not. Then all that is to be done is to evaluate the
indicated sum, divide it by 7 and see whether the remainder is zero or not. But the
effort involved is considerable, even if the computer is used (if the example is not
impressive enough, replace $99^{99}$ by $99^{99^{99}}$), but at this point the
constructivist declares that the proposed procedure is a constructive one, because
the division could be carried out!
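The proposed procedure can be sketched in Python, whose built-in integers have
arbitrary precision. The function names below are assumptions of this example,
and the sum is read as $1^1 + 2^2 + \ldots + 99^{99}$. The first function
reduces modulo 7 while exponentiating, the second one literally evaluates the
sum first and divides afterwards, as described in the text.

```python
def remainder_mod_7(last=99):
    """Remainder of 1**1 + 2**2 + ... + last**last after division by 7.

    pow(k, k, 7) reduces modulo 7 during exponentiation, so no huge
    intermediate numbers are formed.
    """
    return sum(pow(k, k, 7) for k in range(1, last + 1)) % 7


def remainder_mod_7_direct(last=99):
    """Same remainder, obtained by evaluating the full sum first."""
    return sum(k ** k for k in range(1, last + 1)) % 7


r = remainder_mod_7()
print(r, "divisible by 7" if r == 0 else "not divisible by 7")
```

Both functions must return the same remainder; the point of the example in the
text is only that such a finite computation exists, not how long it takes.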

The case is similar to the construction of a regular polygon with 65,537 ($= 1 + 2^{16}$)
sides by means of ruler and compass (and paper and pencil) only, except that 
there seems to have been a mathematician who spent 10 years of his life checking 
this. He truly was a constructivist. 

From the point of view of constructivism many statements of formalism are incor- 
rect or even nonsensical. Yet, from the same point of view it seems impossible ever 
to derive a contradiction from any statement proved within formalism, because 
it would seem that the required proof would have to be based on the principle 
of the excluded third (see, for example, E.W. Beth [23]): constructivism nobly 
denies itself the indispensable weapon that it would need to defeat its enemy. 

Although constructivism may seem a restricted kind of mathematics, it is sound 
mathematics and its achievements are remarkable. Nonstandard analysis would 
be impossible within it, however, because then the axiom of choice would be 
required, but, as we will see, this axiom is of a highly nonconstructive nature. 



1.13 Selecting basic assumptions naively 

From a naive, common sense point of view, both formalism and constructivism
have agreeable as well as disagreeable characteristics. Fortunately, we are not
bound to choose between the two. In this section a number of basic assumptions
will be presented that combine the agreeable aspects of both schools, and that
at the same time are easily understandable and fairly obvious, perhaps with the
exception of the axiom of choice, but this axiom is dictated by the subject
of this book. The net effect of combining the starting points of both schools will
be that the ensuing mathematics is identical to the mathematics of the formalistic
school. It is this combination that will serve as the basis of the theory of the
next chapters.

Basic assumptions regarding logic. Assume that mathematics is subject to the 
principles of logic, in particular to the principle of the excluded third, which holds 
that any statement is either true or false; there is no third possibility. 

Basic assumptions regarding the natural numbers. Assume that when starting
mathematics the natural numbers, including 0, are given to us, and that they can
be used to count. Here 0 is the natural number that is used if there is nothing to
count.

In formalistic mathematics the natural numbers are regarded as sets, but below
this will not be done. The natural number is the first kind of variable that will
be considered. As usual a variable may assume certain values; each value is a
constant.

From the present basic assumption it follows that it is legitimate to use the 
well-known inductive proof. An inductive proof has the following structure. 

Suppose we are concerned with infinitely many mathematical statements that can be
counted by means of the natural numbers 0, 1, 2, .... Suppose further that $P(0)$
is the first statement to be counted, that $P(1)$ is the second one, etc. Assume
that $P(0)$ is a true statement. Also assume that given any natural number $n$, if
$P(n)$ would be true then $P(n')$ would be true, where $n'$ is the natural number
that comes immediately after $n$. Then $P(m)$ is true for any natural number $m$.

Indeed, as $P(0)$ is true, it follows that $P(1)$ is true (taking $n = 0$ and hence
$n' = 1$), but since $P(1)$ is true it follows that $P(2)$ is true (taking $n = 1$
and hence $n' = 2$), etc. Hence, when counting the given statements they can at the
same time be proven to be true. Since by assumption all statements can be
counted, they are all true, which shows the validity of the inductive proof.

For interesting arguments in favour of the present basic assumption, the reader 
may consult A. Heyting [24]. 

Basic assumptions regarding sets. The main difference between the assumptions 
that follow below and the ZF axioms of set theory, is that in the latter there are 
sets that have to be filled in with elements, or with no elements at all, whereas 
below there are elements (for example, the natural numbers) to begin with that 
may be taken together to form sets. As a consequence, in formalism any constant 
is a set, whereas below there are numbers, sets, n-tuples (among them pairs 
and triples), and functions (among them sequences), that each have their own
intuitively determined character, although it should be admitted that the intuitive 
meaning of the various types of constants is not strictly required when deducing 
theorems from the axioms. 

Existence (or specification). Given any moment in time, assume the possibility
to either label or not label each of the then available constants separately, and
to consider the set of precisely all unlabeled constants. A suitable label would
be: 'no'. The unlabeled constants are called the elements of the set. As usual the
relationship between a set and its elements is indicated by means of the symbol
$\in$.

Labeling all available constants leads to the existence of the empty set, indicated
by $\emptyset$.

A well-known way to form sets is by a mathematical statement $P(x)$ depending
on a single free variable $x$; $x$ is labeled if and only if $P(x)$ does not hold.

Equality (or extensionality). Two sets $X$ and $Y$ are equal if and only if they
have the same elements, or, more formally, if $x \in X$ implies that $x \in Y$ and
vice versa. It follows that the empty set is unique.

Assume that any set is a constant, and hence that variables can be introduced
whose values are sets, as well as sets whose elements are themselves sets, etc.,
but with the limitation that given any set, 'taking an element of' can be applied a
finite number of times only. Let the maximum number of times this can be done
be called the level of the set, so that the lowest possible level is 1. As a natural
number is not itself a set it is called an urelement or individual. In general, an
urelement is any constant that is not a set. One might say that urelements are
at level 0.

Without much loss of generality it may be assumed that all sets are regular. By
definition a set of level $k$ is called regular if all its elements are urelements
(so that $k = 1$), or if $k \geq 2$ and all its elements are regular of level
$k - 1$. Hence $\{x, \{x, y\}\}$ is not regular, but $\{\{x\}, \{x, y\}\}$ is if $x$
and $y$ both are urelements, or both are regular.
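The notions of level and regularity can be checked mechanically for small
examples. In the following Python sketch sets are modelled by frozensets and
every other value counts as an urelement; these modelling choices, the function
names, and the decision to give the empty set level 1, are assumptions of the
example and not part of the text.

```python
def level(c):
    """Level of a constant: 0 for an urelement, otherwise 1 + the largest
    level among its elements (the empty set is given level 1 here)."""
    if not isinstance(c, frozenset):
        return 0
    return 1 + max((level(e) for e in c), default=0)


def is_regular(c):
    """True if c is a regular set in the sense of the definition above."""
    if not isinstance(c, frozenset):
        return False                    # urelements are not sets
    k = level(c)
    if k == 1:
        return True                     # all elements are urelements
    # k >= 2: every element must itself be regular of level k - 1
    return all(level(e) == k - 1 and is_regular(e) for e in c)


x, y = "x", "y"                         # two urelements
mixed = frozenset([x, frozenset([x, y])])                 # {x, {x, y}}
uniform = frozenset([frozenset([x]), frozenset([x, y])])  # {{x}, {x, y}}
print(level(mixed), is_regular(mixed))
print(level(uniform), is_regular(uniform))
```

The first set mixes an urelement with a set of level 1 and is therefore not
regular; in the second set all elements are regular sets of level 1.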

It follows that subsets and the power set of a given set can be formed, as well as 
the complement of a subset, and the difference, the union and the intersection of 
two sets. The notations are as follows. 

Sets: $\{x : P(x)\}$ (i.e. the set of all $x$ such that $P(x)$), or
$\{x \in X : \ldots\}$ (i.e. the set of all $x$ such that $x \in X$ and such that
$\ldots$), or $\{a, b, c, d\}$ (i.e. the set with the elements $a$, $b$, $c$, and
$d$), or $\{a_1, a_2, a_3, \ldots\}$ (i.e. the set with the elements $a_n$ for each
natural number $n$), etc.

The subset and superset relations: $\subseteq$ and $\supseteq$.

The proper subset and proper superset relations: $\subset$ and $\supset$.

The power set of a set $X$: $\mathcal{P}(X)$ or $2^X$.

The complement of a subset $X$: $X^c$.

The difference, the union, and the intersection operations: $-$, $\cup$ and $\cap$.

From these assumptions it follows that a notion like the set of all sets is
nonsensical. This notion plays a role in one of the many paradoxes that were found
after the first, somewhat careless, set-up of the theory of sets. The argument runs
as follows. If there were a set $X$ of all sets, consider $\mathcal{P}(X)$; then
$\mathcal{P}(X)$ would be an element of $X$, as $X$ is the set of all sets, but
$\mathcal{P}(X)$ cannot be an element of $X$, contradiction. This paradox arises
because the interference of time is ignored: if today the set $X$ of all sets that
are known so far is introduced, only tomorrow $\mathcal{P}(X)$ can be introduced.
The existence of this and other paradoxes has been an impetus to the development of
both formalism and constructivism.

As usual, the symbol $\mathbb{N}$ is reserved for the set,

$\{n : n$ is a natural number different from $0\}$,

or, simply,

$\mathbb{N} = \{1, 2, 3, \ldots\}$.

Similarly,

$\mathbb{N}^0 = \{0, 1, 2, \ldots\}$.

Basic assumptions regarding n-tuples. Apart from natural numbers and sets also
pairs, triples, and in general $n$-tuples are taken as primitive notions. The
ordered pair or simply pair of two constants $x$ and $y$ is denoted by $(x, y)$. The
basic property of pairs is that $(x, y) = (x', y')$ if and only if $x = x'$ and
$y = y'$. Often, in particular within formalism, $(x, y)$ is defined as the
following set,

$P = \{\{x\}, \{x, y\}\}$, (1.7)

a definition that is due to Kuratowski. But there are many variations on this
theme. One of the alternatives is to let $(x, y)$ be the set,

$P' = \{\{\{x\}, \emptyset\}, \{\{y\}\}\}$. (1.8)

It is not difficult to show that both (1.7) and (1.8) satisfy the basic property
of pairs, but clearly $P \neq P'$. Why prefer (1.7) over (1.8)? In fact definitions
of $(x, y)$ as sets do too much. If $(x, y)$ would be equated to (1.7), for example,
then $\{(x, x)\} = \{\{\{x\}\}\}$. This property is entirely accidental. See M.D.
Potter [25], from which alternative (1.8) has been taken, for interesting details.
For these reasons we follow Bourbaki's decision to regard the pair as a primitive
term.
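The two set codings of the pair can be compared on small examples. The Python
sketch below models sets by frozensets; the function names kuratowski and
alternative are chosen for this illustration only, and the check of the basic
property is carried out over a three-element domain, which is of course not a
proof.

```python
from itertools import product


def kuratowski(x, y):
    """The pair (x, y) coded as the set {{x}, {x, y}}."""
    return frozenset([frozenset([x]), frozenset([x, y])])


def alternative(x, y):
    """The pair (x, y) coded as {{{x}, E}, {{y}}}, with E the empty set."""
    empty = frozenset()
    return frozenset([
        frozenset([frozenset([x]), empty]),
        frozenset([frozenset([y])]),
    ])


def has_pair_property(code, domain):
    """code(x, y) == code(u, v) exactly when x == u and y == v, on domain."""
    return all((code(x, y) == code(u, v)) == ((x, y) == (u, v))
               for x, y, u, v in product(domain, repeat=4))


domain = ["a", "b", "c"]
print(has_pair_property(kuratowski, domain))
print(has_pair_property(alternative, domain))
print(kuratowski("a", "b") == alternative("a", "b"))  # the codings differ
print(kuratowski("a", "a") == frozenset([frozenset(["a"])]))  # (a, a) is {{a}}
```

Both codings pass the check on this domain although they produce different
sets, and the last line shows the accidental collapse of $(x, x)$ to
$\{\{x\}\}$ under the first coding.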

The basic property of the triple $(x, y, z)$ is, of course, that
$(x, y, z) = (x', y', z')$ if and only if $x = x'$ and $y = y'$ and $z = z'$.
Instead of regarding the triple as a primitive notion, it is often defined as the
pair $(x, (y, z))$, although the obvious alternative is $((x, y), z)$. Combining the
first choice with Kuratowski's definition the result would be that

$(x, y, z) = \{\{x\}, \{x, \{\{y\}, \{y, z\}\}\}\}$,

but why not let,

$(x, y, z) = \{\{x\}, \{x, y\}, \{x, y, z\}\}$?

Similar remarks can be made with respect to the $n$-tuple $(x_1, \ldots, x_n)$,
given any $n \in \mathbb{N}$, $n \geq 4$.

For any $n > 1$, $x_j$ is the $j$-th term of the $n$-tuple $(x_1, \ldots, x_n)$,
$j = 1, \ldots, n$.

Basic assumptions regarding generating new constants. Given any constant, assume
the possibility to let a, possibly new, constant be generated by this constant,
subject to the following two rules:

1) Identification. If the generated constant is not to be new it must be
identified explicitly with a constant already known.

2) Equality. Equality of constants must be defined in such a way that all
constants, new and old, satisfy the well-known rules of equivalence: $x = x$,
$x = y \Rightarrow y = x$, and $x = y \wedge y = z \Rightarrow x = z$.

The character of a new constant is determined by these two rules, as well as by
its intuitive interpretation. Anyway, a new constant is regarded as an urelement.

The axiom of choice. Assume that the axiom of choice holds true, i.e. that given 
any infinite set, whose elements are nonempty sets, it is possible to select an 
element from each of these elements. 

As will be illustrated in Section 1.16, this axiom is of a highly nonconstructive 
character, but it will only be used to establish the existence of free ultrafilters. 



1.14 Basic definitions 

A) Functions. Given two nonempty sets $X$ and $Y$, a function (or mapping
or map) $f : X \to Y$ is generated by a set $G$ with elements of the form
$(x, y)$, $x \in X$, $y \in Y$, such that for all $x \in X$ there is a unique
$y \in Y$ with $(x, y) \in G$. $G$ is called the graph of the function $f$.
Functions are regarded as new constants so that they are urelements, and two
functions are equal if and only if their generating sets are equal. The intuitive
meaning of a function is that of an assignment: a function assigns a certain $y$ to
a given $x$. As usual the relationship between $x$ and $y$ is written as
$y = f(x)$, or as $x \mapsto y$, or as $x \mapsto f(x)$. In case $X$ is a subset of
$\mathbb{N}$, and $n \in X$, $f(n)$ is often written as $f_n$. Also the definitions
of function value, domain, range, injective (or one-to-one), surjective (or onto),
and bijective (or one-to-one onto) functions, as well as the inverse of a bijective
function are as usual.

Sequences. If, for $n \in \mathbb{N}$, $X = \{1, \ldots, n\}$ (or
$X = \{0, 1, \ldots, n\}$), a function $f : X \to Y$ is called a finite sequence;
and if $X = \mathbb{N}$ (or $X = \mathbb{N}^0$) then $f$ is called an infinite
sequence. The $f_j$ for $j \in X$ are called the terms of the sequence. The usual
notation of sequences is:

$(f_1, \ldots, f_n)$ or $(f_1, f_2, \ldots)$,

and similarly in case $0 \in X$.

Note that although intuitively there may not be much difference between the
$n$-tuple $(x_1, \ldots, x_n)$ and the finite sequence $(x_1, \ldots, x_n)$,
formally there is.

Infinite sequences will play a crucial part in what is going to follow.

Kinds of sets. Given a set $X$, if there exists a bijection from $X$ onto
$\{1, 2, \ldots, n\}$ for some $n \in \mathbb{N}$, then $X$ is called a finite set,
otherwise an infinite set. Alternative formulations in these cases are that $X$
contains a finite number of elements or an infinite number of elements,
respectively. If $X$ is an infinite set, and if there exists a bijection from $X$ to
$\mathbb{N}$, then $X$ is called a countably infinite or denumerably infinite set,
or simply countable or denumerable.

The Cartesian or direct product of two sets $S$ and $T$ is the set of pairs
$\{(s, t) : s \in S, t \in T\}$. The product is indicated by $S \times T$. A similar
definition can be given for $n$ sets by means of $n$-tuples, $n \geq 3$.

Sets representing n-tuples or functions. In what follows it will be convenient
to relate $n$-tuples and functions in a one-to-one way to certain sets. The choice
of these sets is completely arbitrary, as long as this requirement is satisfied.
(If $n$-tuples and functions would have been defined as sets themselves, this
representation would be superfluous!) Given the $n$-tuple $(x_1, \ldots, x_n)$, let
the set be,

$\{\{x_1\}, \{x_1, x_2\}, \ldots, \{x_1, \ldots, x_n\}\}$,

and given the function $f$ let the set be its graph $G$, i.e. the set that
generates $f$. Then there exist bijections $T_n$ and $F$ such that,

$(x_1, \ldots, x_n) = T_n(\{\{x_1\}, \{x_1, x_2\}, \ldots, \{x_1, \ldots, x_n\}\})$,
$\{\{x_1\}, \{x_1, x_2\}, \ldots, \{x_1, \ldots, x_n\}\} = T_n^{-1}((x_1, \ldots, x_n))$,
$f = F(G)$, and $G = F^{-1}(f)$.



1.15 Filters 

In Section 1.10, where infinitesimals were introduced by plausible reasoning, the
conclusion was reached that a set of 'acceptable' subsets of $\mathbb{N}$ was
needed, satisfying the following rules:

1) $\mathbb{N}$ is accepted,

2) accepted sets are infinite,

3) of any two disjoint sets whose union is equal to an accepted set, precisely
one is accepted, and,

4) subsets of rejected subsets are rejected.

Such a set was called a free ultrafilter (over $\mathbb{N}$). Hence $U$ is a free
ultrafilter if:

1) $\mathbb{N} \in U$,

2) if $Q \in U$, then $Q$ is infinite,

3) if $Q = Q_1 \cup Q_2 \in U$, $Q_1 \cap Q_2 = \emptyset$, then either $Q_1 \in U$
or $Q_2 \in U$, but not both, and,

4) if $Q \notin U$ and $S \subseteq Q$, then $S \notin U$.

These requirements are equivalent to the following more usual ones. $U$ is a free
ultrafilter (over $\mathbb{N}$) if:

1a) $\mathbb{N} \in U$,

2a) if $Q \in U$ and if $R \supseteq Q$, then $R \in U$,

3a) if $Q \in U$ and if $R \in U$, then $Q \cap R \in U$,

4a) if $Q \in U$, then $Q$ is infinite, and,

5a) if $Q \subseteq \mathbb{N}$, then either $Q \in U$ or
$Q^c = \mathbb{N} - Q \in U$, but not both.
Proof of the equivalence:

A. Let 1) to 4) hold. Then 1a) and 4a) hold trivially and 5a) follows immediately
from 1) and 3). If $Q \in U$ and $R \supseteq Q$, then $Q^c \notin U$ and
$R^c \subseteq Q^c$, hence $R^c \notin U$ and $R \in U$, which proves 2a). Finally,
if $Q, R \in U$, then $Q^c \notin U$, hence $Q^c \cap R \notin U$ by 4), hence
$Q \cap R \in U$ by 3), since $R$ is the disjoint union of $Q \cap R$ and
$Q^c \cap R$, which proves 3a).

B. The proof of the converse is left as an exercise. □

The terminology used rightly suggests that there are more general filters (over
$\mathbb{N}$). Here are their definitions.

Any set $F$ of subsets of $\mathbb{N}$ is called a filter (over $\mathbb{N}$) if:

1b) $\mathbb{N} \in F$,

2b) $\emptyset \notin F$,

3b) if $Q \in F$ and if $\mathbb{N} \supseteq R \supseteq Q$, then $R \in F$,

4b) if $Q \in F$ and if $R \in F$, then $Q \cap R \in F$.

Note that 2b) and 4b) imply that if $Q \in F$ then $Q^c \notin F$, but this is not
to say that either $Q \in F$ or $Q^c \in F$.

A filter is called free if all of its elements are infinite. 

Remark: Another definition is: a filter is called free if the intersection of all of its 
elements is empty. The two definitions are equivalent for an ultrafilter, but not 
for any filter. 

Finally, a filter $U$ is called an ultrafilter if for any $Q \subseteq \mathbb{N}$,
either $Q \in U$ or $Q^c \in U$, but not both.

Nonfree filters are not very interesting for our subject, but free filters and free
ultrafilters are. An example of a free filter (that is not an ultrafilter) is the
set $F^0 = \{Q : [\exists k \in \mathbb{N} : Q \supseteq Q_k]\}$ with
$Q_k = \{i : i \geq k\}$. This filter, which is called the Fréchet filter, can be
used for the definition of converging sequences, and hence for that of
infinitesimals as suggested by Chwistek (see Section 1.9). For nonstandard analysis
the really important filters are the free ultrafilters, because with a filter like
$F^0$ only an incomplete kind of nonstandard analysis can be developed, as will be
shown later on in Section 4.4.
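The use of the Fréchet filter for convergence can be illustrated with the
sequence $(1, 1/2, 1/3, \ldots)$. The Python sketch below assumes that a
sequence converges to 0 if for every $\varepsilon > 0$ the index set
$\{i : |s_i| < \varepsilon\}$ contains some tail $\{i : i \geq k\}$ and hence
belongs to the Fréchet filter; the function name tail_start is chosen for this
example, and the final check covers only finitely many indices.

```python
from fractions import Fraction
from math import floor


def tail_start(eps):
    """Smallest k such that 1/i < eps for every i >= k (eps > 0)."""
    # 1/i < eps holds exactly when i > 1/eps
    return floor(1 / eps) + 1


eps = Fraction(1, 10)
k = tail_start(eps)
print(k)
# spot check on a finite stretch of the tail starting at k
print(all(Fraction(1, i) < eps for i in range(k, k + 1000)))
# the index just before the tail fails the inequality
print(Fraction(1, k - 1) < eps)
```

For each $\varepsilon$ the set of 'good' indices is a superset of a tail, which
is all that membership of the Fréchet filter requires; no choice between, say,
the odd and the even indices is ever needed.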

Filters over a set $M$ different from $\mathbb{N}$ can also be introduced. Then all
that is needed is to replace $\mathbb{N}$ in the definitions above by $M$. Below
this will only happen in the next section, where $M$ is an infinite subset of
$\mathbb{N}$.
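The filter and ultrafilter requirements can be checked by brute force when
$M$ is a small finite set. This deviates from the text, where $M$ is infinite,
and is meant only to make the requirements tangible: over a finite set every
filter has finite elements only, so none of the filters below is free. All
function names in this Python sketch are assumptions of the example.

```python
from itertools import combinations


def subsets(m):
    """All subsets of the finite set m, as frozensets."""
    items = list(m)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_filter(f, m):
    """Check the four filter requirements with m in the role of N."""
    m = frozenset(m)
    return (m in f                                         # m itself is in f
            and frozenset() not in f                       # empty set is not
            and all(r in f for q in f
                    for r in subsets(m) if q <= r)         # supersets
            and all(q & r in f for q in f for r in f))     # intersections


def is_ultrafilter(f, m):
    """A filter containing exactly one of Q and its complement, for all Q."""
    m = frozenset(m)
    return is_filter(f, m) and all((q in f) != (m - q in f)
                                   for q in subsets(m))


M = {1, 2, 3}
principal = {q for q in subsets(M) if 2 in q}    # all subsets containing 2
upper = {q for q in subsets(M) if {1, 2} <= q}   # all supersets of {1, 2}
print(is_filter(principal, M), is_ultrafilter(principal, M))
print(is_filter(upper, M), is_ultrafilter(upper, M))
```

The family of all subsets containing a fixed element is an ultrafilter, whereas
the supersets of a two-element set form a filter that decides neither $\{1\}$
nor its complement.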

Exercise: Show that if $Q = R_1 \cup R_2 \cup \ldots \cup R_n \in F$, for some
$n \in \mathbb{N}$, $R_i \cap R_j = \emptyset$ if $i \neq j$, and if $F$ is an
ultrafilter, then $R_i \in F$ for precisely one $i \in \{1, 2, \ldots, n\}$.

Exercise: Show that if $F$ is an ultrafilter, and if $M \in F$, then an ultrafilter
$G$ over $M$ is induced by $F$ and $M$, if we let
$G = \{Q : Q \subseteq M, Q \in F\}$. Also show that $G = \{Q \cap M : Q \in F\}$.

Exercise: Show that if $F$ is a free ultrafilter, if $Q \in F$ and if $q_i \in Q$,
$i = 1, 2, \ldots, n$, $n \in \mathbb{N}$, then $Q - \{q_1, \ldots, q_n\} \in F$.

Most likely, no completely specified examples of free ultrafilters can ever be given.
Evidence for this statement is given in the next section. But if the axiom of choice
is invoked their existence can nevertheless be shown.

Theorem 1.15.1 Free ultrafilters over any infinite set exist. 

Proof: The proof is given in the appendix. □ 

It is only in this proof that the axiom of choice is required in the version of 
nonstandard analysis that is presented in this book, assuming, of course, that 
when nonstandard analysis is used to prove a classical theorem, the axiom is not 
necessary in a classical proof of that theorem. 

Since there are many free ultrafilters, it is necessary to select one. The selection 
is completely arbitrary, or rather the selection must of necessity be extremely 
arbitrary: apart from a tiny little bit we will not know which free ultrafilter we 
are really dealing with, so that, in fact, the term 'selection' is not quite adequate. 
This is a consequence of the highly nonconstructive character of the axiom of 
choice. 

From now on assume that $U$ is some free ultrafilter (over $\mathbb{N}$) and that it
is fixed once and for all.

Since $U$ is not known, it is not known either whether the infinitesimal
$s = H(+1, -1/2, +1/3, -1/4, \ldots)$, that is generated by the infinite sequence
$(+1, -1/2, +1/3, -1/4, \ldots)$, is positive or negative. For if
$Q = \{1, 3, 5, \ldots\} \in U$ then $s > 0$, and if
$Q^c = \{2, 4, 6, \ldots\} \in U$ then $s < 0$, but we do not know whether or not
$Q$ is in $U$. Surprisingly enough, this and similar dichotomies will not hurt at
all, simply because the final answers are independent of the underlying filter.



1.16 About the nature of free ultrafilters 

As already mentioned there are many free ultrafilters U (over IN). Actually their 
number is uncountable. Although this fact is not very important for what is going 
to follow, so that the next argument may be skipped, a proof of this statement 
is given here, because it may give some more insight into the nature of free 
ultrafilters, in particular as far as nonconstructability is concerned. 

First of all the next theorem is needed.

Theorem 1.16.1 Let M be any infinite subset of ℕ, then there is a free
ultrafilter U over ℕ such that M ∈ U.

Proof: Let D = ℕ − M, then if Q ⊆ ℕ then Q = Q' ∪ R for unique Q' ⊆ M and
R ⊆ D. Let U' be a free ultrafilter over M, and let U be the set of all Q such
that Q' ∈ U'. Then U is a free ultrafilter over ℕ, and M ∈ U. The verification
that U indeed is a free ultrafilter is left as an exercise. □

From this theorem it follows that if a free ultrafilter U over ℕ is wanted, we can
require beforehand that either Q_0 = {1, 3, 5, ...} ∈ U or Q_1 = {2, 4, 6, ...} ∈ U,
for simply let M = Q_0 or M = Q_1. This is also true for the other Q's that will
be introduced. If we require beforehand that Q_0 ∈ U, then we can also require
beforehand that either Q_00 = {1, 5, 9, ...} ∈ U or Q_01 = {3, 7, 11, ...} ∈ U, and
if we require beforehand that Q_1 ∈ U, then we can also require beforehand that
either Q_10 = {2, 6, 10, ...} ∈ U or Q_11 = {4, 8, 12, ...} ∈ U. These four cases
can be split up into eight new cases in total, which in turn can be split up into
sixteen new cases, and so on. Taking the n-th step, 2^n new Q's are added, each
with n indices, each of which is 0 or 1. Apparently, we can require beforehand
that, e.g. Q_0, Q_00, Q_000, Q_0000, ... are elements of U, but we can just as well
replace this sequence of selections by Q_0, Q_01, Q_010, Q_0101, ..., and so on. Clearly,
if, say Q_abcd ∈ U, the next Q is Q_abcde with e = 0 or e = 1. It follows that each
infinite sequence of 0's and 1's defines an infinite sequence of selections, and vice
versa. Since different infinite sequences of 0's and 1's thus lead to different filters,
and since the set of all these sequences is uncountable, it follows that there are
uncountably many free ultrafilters over ℕ, as claimed. □
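
The index sets used in this argument are arithmetic progressions, and a small sketch may make the splitting pattern concrete. The function names q_set and q_prefix are hypothetical, and the closed form for the starting point is read off from the sets listed above (Q_0, Q_1, Q_00, Q_01, Q_10, Q_11); it is an assumption that the same pattern continues at later steps.

```python
from itertools import count, islice

def q_set(bits):
    """Return (start, step) of the progression Q_bits, e.g. bits = "01" for Q_01.

    Assumed pattern: n indices give step 2**n, and the k-th index
    (k = 1, 2, ...) shifts the start by 2**(k-1) when it equals 1.
    """
    if not bits or any(b not in "01" for b in bits):
        raise ValueError("bits must be a non-empty string of 0s and 1s")
    step = 2 ** len(bits)
    start = 1 + sum(2 ** k for k, b in enumerate(bits) if b == "1")
    return start, step

def q_prefix(bits, length=3):
    """First `length` elements of Q_bits."""
    start, step = q_set(bits)
    return list(islice(count(start, step), length))
```

For example, q_prefix("01") gives the first elements of Q_01, and merging q_prefix("010") with q_prefix("011") reproduces an initial part of Q_01, which is the split into two infinite subsets used in the argument.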

Note that selecting just one infinite sequence of zeros and ones would not specify
U completely. Far from it, because if the sequence is, say, (0, 1, 0, 1, ...), so that
Q_0, Q_01, Q_010, Q_0101, ... all are in U, then each of these Q's can be split up into
two, four, ... arbitrary infinite subsets and at each split a choice has to be made
as far as the membership of U is concerned.

The conclusion must be that free ultrafilters cannot be completely constructed 
and that this is due to the fact that the axiom of choice cannot be dispensed with. 
Nevertheless, as already indicated in the previous section, classical mathematics 
developed by means of nonstandard analysis can be kept free from this axiom, if 
desired. 



Chapter 2 



Basic theory 



2.1 Reviewing the introduction of Z, Q and R 

A) Integers. Each integer is generated by a pair of natural numbers. If (m, n)
is such a pair, the integer generated by it is indicated by Z(m, n). ℤ is the
set of all Z(m, n).

Equality. Z(m, n) = Z(p, q) if and only if m ≥ n, p ≥ q and m − n = p − q,
or m < n, p < q and n − m = q − p.

Exercise: Show that this equality relation satisfies the rules of equivalence.
Identification. For all m ∈ ℕ, let Z(m, 0) = m.

The more usual notation instead of Z(m, n) is, of course, say, q in case
m ≥ n and then q = m − n, or −q in case m < n and then q = n − m.

B) Rationals. Each rational is generated by a pair (m, n) of integers with
n ≠ 0. Let Q(m, n) be the rational generated by (m, n). ℚ is the set of all
Q(m, n).

Equality. Q(m, n) = Q(p, q) if and only if mq = np.
Exercise: Verify again the rules of equivalence.
Identification. For all k ∈ ℤ, let Q(k, 1) = k.

Again the usual notation is different: |m| / |n| or −|m| / |n| instead
of Q(m, n), depending on whether mn ≥ 0 or mn < 0, preferably such
that m and n have no common divisors.

C) Reals. Each real is generated by a Cauchy sequence of rationals, i.e. an
infinite sequence (r_1, r_2, ...) of rationals r_n, such that,

∀m ∈ ℕ : ∃k ∈ ℕ : ∀n, p ∈ ℕ, n, p > k : |r_n − r_p| < 1/m.

Let the real generated by (r_1, r_2, ...) be indicated by R(r_1, r_2, ...). ℝ is
the set of all R(r_1, r_2, ...).

Equality. R(r_1, r_2, ...) = R(s_1, s_2, ...) if and only if (r_1, r_2, ...) and
(s_1, s_2, ...) are concurrent Cauchy sequences of rationals, i.e. if

∀m ∈ ℕ : ∃k ∈ ℕ : ∀n ∈ ℕ, n > k : |r_n − s_n| < 1/m.

The verification of the rules of equivalence is now somewhat more involved.
Identification. For all r ∈ ℚ, let R(r, r, ...) = r.

Once again the usual notation is different, although this time there is no
simple notational rule as there was for integers and rationals. Examples
are √2, π, e, etc.
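
The three equality rules in A) to C) can be tried out on concrete generators. The sketch below is only an illustration: all function names are hypothetical, the first case of the integer rule is read as m ≥ n so that pairs with m = n are covered (an assumption), and the concurrency condition for Cauchy sequences can of course only be checked on a finite range of indices, which gives numerical evidence, not a proof.

```python
from fractions import Fraction
from math import isqrt

def z_equal(a, b):
    """Z(m, n) = Z(p, q) by the two-case rule for pairs of naturals."""
    (m, n), (p, q) = a, b
    if m >= n and p >= q:          # assumption: first case read as m >= n
        return m - n == p - q
    if m < n and p < q:
        return n - m == q - p
    return False

def q_equal(a, b):
    """Q(m, n) = Q(p, q) if and only if mq = np (second components nonzero)."""
    (m, n), (p, q) = a, b
    if n == 0 or q == 0:
        raise ValueError("second component must be nonzero")
    return m * q == n * p

def newton(n):
    """n-th term of a rational sequence tending to sqrt(2) (Newton's method)."""
    x = Fraction(1)
    for _ in range(n - 1):
        x = (x + 2 / x) / 2
    return x

def decimal(n):
    """sqrt(2) truncated to n decimal places, as a rational."""
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)

def close_from(r, s, m, k, horizon):
    """Check |r_n - s_n| < 1/m for k < n <= horizon (a finite check only)."""
    bound = Fraction(1, m)
    return all(abs(r(n) - s(n)) < bound for n in range(k + 1, horizon + 1))
```

Here newton and decimal are two different rational sequences that both generate √2; close_from(newton, decimal, 1000, 3, 8) tests the concurrency condition for m = 1000 with k = 3 on the indices 4 to 8.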

In all three cases, A) to C), operations such as + and simple relations such as 
< can easily be defined for the new numbers by means of the corresponding 
operations and simple relations that are known for the terms of the sequences 
generating them. In some, but certainly not all, cases the definitions can be given
'term by term', such as + and ≤ for the reals:

R(r_1, r_2, ...) + R(s_1, s_2, ...) = R(r_1 + s_1, r_2 + s_2, ...),

and,

R(r_1, r_2, ...) ≤ R(s_1, s_2, ...) if r_1 ≤ s_1, r_2 ≤ s_2, ...

It would be wrong, however, to define < and > for reals in this way! A simple
counterexample is,

?? 0 = R(0, 0, ...) < R(1, 1/2, 1/3, ...) = 0, or 0 < 0 ??

Exercise: Present the right definitions. More generally, present the definitions for
+, −, ·, /, | · | for integers, rationals and reals, where appropriate.

The success of all this is, of course, that apart from the fact that any two numbers 
of some kind have a sum and a product of the same kind, in addition any two 
integers have a difference that is an integer, any two rationals (one of which is 
nonzero) have a difference and a quotient that are rationals, any two reals (one of 
which is nonzero) have a difference and a quotient that are reals, and any Cauchy 
sequence of reals has a limit that is a real (the proof of the last statement is not 
entirely trivial). 

From their definitions it follows that any integer, any rational and any real is an
urelement.

2.2 Introducing internal constants; definition of equality

The collection of all constants of nonstandard analysis can be determined in two 
steps, starting from the collection of all classical constants. First of all generate 
constants by means of infinite sequences of classical constants and add them 
to the collection if they are new, i.e. if they are not identified with a classical 
constant. The generation of these constants will be explained in detail below. 
Then add new constants (sets, pairs, triples, . . ., functions) to the collection that 
can be defined by means of what is in the collection so far in the same way as this 
is done in classical mathematics (for example, add the set of all infinitesimals). 

Every constant of nonstandard analysis is either internal or external. The internal 
constants should be regarded as the decent members of the society of nonstandard 
analysis, and the external constants as its outcasts, because the behaviour of the 
former is like that of similar classical constants, but that of the latter may be 
strange, unexpected and counterintuitive, which is why in internal set theory (see 
Section 1.9) by set automatically is meant internal set, and an external set is not 
a set at all. Here we do not go that far: any set is either internal or external, and 
in the latter case still is a set. An example of an external set is ℕ.

Every internal constant is either standard or not. Standard constants are char- 
acterized by a close relationship to classical constants, and some of them are 
even identified with the latter. Every internal constant that is not standard and 
every external constant is nonstandard (see the diagram in Section 1.6). The in- 
ternal constants are introduced first, then follow the standard constants and the 
external constants. 

Remark: The terms 'standard', 'internal' and 'external' have been taken from 
Nelson's internal set theory. 

Recall that a free ultrafilter U over ℕ has been 'selected' once and for all, but that
so far the definition of new constants did not involve U, i.e. so far only classical
constants have been considered.

Now let each infinite sequence (s_1, s_2, ...) of classical constants s_i generate an
internal constant or hyperconstant H(s_1, s_2, ...), which will often be abbreviated
to H(s_i). After the development of some theory, most of the time internal
constants can be used without a generating sequence of classical constants being
needed. This is entirely analogous to the generation of the reals R(r_1, r_2, ...) by
means of Cauchy sequences of rationals r_i. Most of the time one can use, say, √2
without needing some Cauchy sequence tending to √2.

If all s_i are numbers, also H(s_i) will be regarded as a number (as it may not be a
set in the present theory, it is not so easy to say what a number really is; all that
can be done is tell who is a number). Similarly, if all S_i are sets, also H(S_i) will
be a set; if all p_i are pairs, also H(p_i) will be a pair (and similarly for n-tuples),
and if all f_i are functions, also H(f_i) will be a function. Hence hypersets will
be sets, hyperpairs will be pairs (and similarly for n-tuples), and hyperfunctions
will be functions, in the classical sense of these terms, but hypernumbers will not
necessarily be classical numbers.

The introduction of H(s_1, s_2, ...) requires that the rules of identification and
equality must be specified. The latter is the simpler of the two:

Definition of equality of internal constants:

H(s_1, s_2, ...) = H(t_1, t_2, ...) if and only if {i : s_i = t_i} ∈ U.

This definition applies no matter whether the s_i and t_i are urelements or not. It
implies that it is pointless to let the s_i be of two different kinds, say, numbers
and pairs of numbers. For example, suppose that s_{2i−1} ∈ ℝ, and s_{2i} = (p_i, q_i),
p_i, q_i ∈ ℝ. Then, since either {2i − 1 : i ∈ ℕ} ∈ U or {2i : i ∈ ℕ} ∈ U, we can just
as well assume that all s_i are reals, or that all s_i are pairs of reals, respectively.
Making the changes such that this is true will change the presentation, but not
the value of H(s_i). This follows immediately from the definition of equality given.
The same argument can be applied if there were not two but three or more kinds
of elements, as long as the number of kinds is finite. A case with an infinite
number of kinds is where s_i is an i-tuple of reals. Another case is where s_i is a set
of level i. In order to avoid difficulties such cases are to be avoided themselves.
This is quite acceptable, because in practice they would seem to be rather fancy.
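
As a small worked illustration of the definition of equality (an added example, not taken from the original text), the following uses only the fact that a free ultrafilter over ℕ contains every subset of ℕ with a finite complement and contains no finite set. The particular sequences are chosen for illustration.

```latex
\begin{align*}
  &H(7, 2, 3, 4, \ldots) = H(1, 2, 3, 4, \ldots)
     &&\text{since } \{i : s_i = t_i\} = \mathbb{N} - \{1\} \in U, \\
  &H(1, 2, 3, \ldots) \neq H(k, k, k, \ldots) \text{ for every } k \in \mathbb{N}
     &&\text{since } \{i : i = k\} = \{k\} \notin U, \\
  &H(1, 0, 1, 0, \ldots) = H(1, 1, 1, \ldots) \iff \{1, 3, 5, \ldots\} \in U
     &&\text{(this depends on the choice of } U\text{).}
\end{align*}
```

The first two lines hold for every free ultrafilter, whereas the third is one of the dichotomies mentioned in Section 1.15: it cannot be decided without knowing more about U.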



2.3 Identification of internal constants 

Identification of internal numbers 

A) If all s_i are numbers, then let,

H(s_1, s_2, ...) = s if and only if {i : s_i = s} ∈ U.

Hence H(3, 3, 3, ...) is identified with 3, but H(1, 2, 3, ...) is new as it is
not identified with any classical number.

Identification of internal sets

The identification of internal sets is based on the following result.

Theorem 2.3.1 Given infinite sequences (S_1, S_2, ...) and (T_1, T_2, ...) of sets S_i
and T_i,

{H(s_i) : s_i ∈ S_i} = {H(t_i) : t_i ∈ T_i} if and only if {i : S_i = T_i} ∈ U.

Proof: Let S = {H(s_i) : s_i ∈ S_i}, T = {H(t_i) : t_i ∈ T_i} and Q = {i : S_i = T_i}.

If Q ∈ U, take H(s_i) ∈ S arbitrarily, then for all i, s_i ∈ S_i, and for i ∈ Q, s_i ∈ T_i,
hence H(s_i) ∈ T, because of the definition of equality, which means that S ⊆ T
and similarly it follows that T ⊆ S, so that S = T.

Conversely, if Q ∉ U then {i : S_i ≠ T_i} ∈ U, hence either

R = {i : s_i ∉ T_i for some s_i ∈ S_i} ∈ U or,
{i : t_i ∉ S_i for some t_i ∈ T_i} ∈ U,

or both. Suppose R ∈ U (the other case is similar). If i ∈ R take s_i ∈ S_i but such
that s_i ∉ T_i, and if i ∉ R take s_i ∈ S_i arbitrary. Then {i : s_i ∉ T_i} ⊇ R, so that
{i : s_i ∉ T_i} ∈ U and H(s_i) ∉ T, because otherwise

∅ = {i : s_i ∉ T_i} ∩ {i : s_i ∈ T_i} ∈ U,

a contradiction. Hence, indeed, H(s_i) ∉ T and S ≠ T. □

Comparing this result with the definition of equality, it follows that H(S_i) =
H(T_i) if and only if S = T, where S_i, T_i, S and T are as in the theorem and its
proof. This suggests the following definition.

B) If all S_i are sets, let,

H(S_1, S_2, ...) = {H(s_i) : s_i ∈ S_i},

which obviously is a (classical) set, although its elements need not be
classical.

Proof of the rules of equivalence for equality for internal numbers and internal
sets

(1) H(s_i) = H(s_i) because {i : s_i = s_i} = ℕ ∈ U.

(2) If H(s_i) = H(t_i) then H(t_i) = H(s_i), because {i : s_i = t_i} = {i : t_i = s_i}.

(3) If H(r_i) = H(s_i) and H(s_i) = H(t_i), so that P = {i : r_i = s_i} ∈ U and
Q = {i : s_i = t_i} ∈ U, then R = {i : r_i = t_i} ∈ U, because R ⊇ P ∩ Q,
hence H(r_i) = H(t_i). □

The next result takes care of some simple cases.

Theorem 2.3.2 If for all i, S_i = {s_i}, then,

H({s_1}, {s_2}, ...) = {H(s_1, s_2, ...)}, or H({s_i}) = {H(s_i)}.

Also, if for all i, S_i = {s_i, s'_i}, then,

H({s_i, s'_i}) = {H(s_i), H(s'_i)},

and in general, if for some n ∈ ℕ and all i, S_i = {s_{i1}, ..., s_{in}}, then,

H({s_{i1}, ..., s_{in}}) = {H(s_{i1}), ..., H(s_{in})}.

Proof: Left as an exercise. □

In shorthand, this result may be written as: H({...}) = {H(...)}, i.e. the
'operator' H may be interchanged with set formation if all the terms S_i of the
generating sequence are sets with n elements, n ∈ ℕ. If all S_i are infinite, the
theorem is wrong. As a counterexample, let S_i = {s_{i1}, s_{i2}, ...} = {1, 2, ...} = ℕ
for all i. Then H(S_i) contains H(1, 2, 3, ...) = H(i), which is not contained in
{H(s_{i1}), H(s_{i2}), ...} = {H(1), H(2), ...} = {1, 2, ...} = ℕ.

Exercise: What if S_i = {1, ..., i}?

Identification of internal n-tuples

Next consider H(p_i) where all p_i = (x_i, y_i) are pairs. It would be nice if H((x_i, y_i))
too would be a pair, but is it? Now recall from Section 1.14 that,

(x_i, y_i) = T_2({{x_i}, {x_i, y_i}}),

so that applying Theorem 2.3.2 twice it follows that,

H(T_2^{-1}((x_i, y_i))) = H({{x_i}, {x_i, y_i}}) = {H({x_i}), H({x_i, y_i})} =
{{H(x_i)}, {H(x_i), H(y_i)}} = T_2^{-1}((H(x_i), H(y_i))).

Therefore,

T_2(H(T_2^{-1}((x_i, y_i)))) = (H(x_i), H(y_i)),

from which it is clear that if T_2 were replaced by the identity map, H((x_i, y_i))
would be the pair (H(x_i), H(y_i)). Fortunately,

H((x_i, y_i)) = H((x'_i, y'_i)) if and only if (H(x_i), H(y_i)) = (H(x'_i), H(y'_i)),

because,

a) H((x_i, y_i)) = H((x'_i, y'_i)) if and only if {i : (x_i, y_i) = (x'_i, y'_i)} ∈ U, and

b) (H(x_i), H(y_i)) = (H(x'_i), H(y'_i)) if and only if
T_2(H(T_2^{-1}((x_i, y_i)))) = T_2(H(T_2^{-1}((x'_i, y'_i)))), hence if and only if
{i : T_2^{-1}((x_i, y_i)) = T_2^{-1}((x'_i, y'_i))} ∈ U, hence if and only if
{i : (x_i, y_i) = (x'_i, y'_i)} ∈ U.

For this reason, H((x_i, y_i)) is identified with (H(x_i), H(y_i)). This means that
the H operator may be interchanged with pairing, and that the internal pair
or hyperpair H((x_i, y_i)) is a (classical) pair, albeit with terms that need not be
classical.
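
The set {{x}, {x, y}} that generates a pair can be modelled directly with finite sets. The following is a minimal sketch under the assumption that T_2 of Section 1.14 acts as the decoding map of this encoding; the names encode_pair and decode_pair are hypothetical.

```python
def encode_pair(x, y):
    """Set {{x}, {x, y}} that generates the pair (x, y)."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def decode_pair(s):
    """Recover (x, y) from {{x}, {x, y}}; plays the role assumed for T_2."""
    members = sorted(s, key=len)
    if len(members) == 1:            # x == y: the set collapses to {{x}}
        (x,) = members[0]
        return (x, x)
    small, large = members
    (x,) = small
    (y,) = large - small
    return (x, y)
```

Since the encoding is one-to-one, two pairs are equal exactly when their generating sets are equal, which is the step used in b) above to pass from pairs to sets and back.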

The same reasoning can be followed for internal or hyper n-tuples.

Identification of internal functions

Turning to functions, the next question is whether, given an infinite sequence
(f_1, f_2, ...) of functions f_i, H(f_i) can be identified with a function, and if so
which one? Let f_i : X_i → Y_i, for certain sets X_i and Y_i. Then f_i is generated by
the set G_i = {(x_i, f_i(x_i)) : x_i ∈ X_i}, with f_i(x_i) ∈ Y_i. Also f_i = F(G_i), see again
Section 1.14. Hence,

H(F^{-1}(f_i)) = H(G_i) = H({(x_i, f_i(x_i)) : x_i ∈ X_i}).

Again interchanging H with set formation, this becomes,

{H((x_i, f_i(x_i))) : H(x_i) ∈ H(X_i)},

and interchanging H with pairing,

{(H(x_i), H(f_i(x_i))) : H(x_i) ∈ H(X_i)},

which set clearly generates a function g from H(X_i) to H(Y_i), because H(f_i(x_i)) ∈
H(Y_i), and it follows that g = F(H(F^{-1}(f_i))). Fortunately, H(f_i) = H(f'_i) if and
only if g = g', with g' = F(H(F^{-1}(f'_i))), because,

a) H(f_i) = H(f'_i) if and only if {i : f_i = f'_i} ∈ U, and

b) g = g' if and only if
H(F^{-1}(f_i)) = H(F^{-1}(f'_i)), hence if and only if
{i : F^{-1}(f_i) = F^{-1}(f'_i)} ∈ U, hence if and only if
{i : f_i = f'_i} ∈ U.

Therefore, H(f_i) is identified with g, so that the internal function or hyperfunction
H(f_i) is a classical function from the internal set H(X_i) to the internal set H(Y_i),
and,

H(f_i)(H(x_i)) = H(f_i(x_i)).

Summarizing, we have:

C) For n-tuples: let H((x_{i1}, ..., x_{in})) = (H(x_{i1}), ..., H(x_{in})).

D) For functions: given f_i : X_i → Y_i, let H(f_i) : H(X_i) → H(Y_i), with

H(f_i)(H(x_i)) = H(f_i(x_i)).

Proof of the rules of equivalence for equality in the case of n-tuples and functions:
In the case of n-tuples, this follows from the identification for n-tuples and the
fact that, if n = 2,

H((x_i, y_i)) = H((x'_i, y'_i)) if and only if (H(x_i), H(y_i)) = (H(x'_i), H(y'_i)),

and in the case of functions, this follows from the identification for pairs and
functions and the fact that,

H(f_i) = H(f'_i) if and only if g = g'.

The details are left as an exercise. □

If for each i the function g_i maps W_i to X_i and the function f_i maps X_i to
Y_i, so that the composition f_i∘g_i is well-defined, then the corresponding internal
composition H(f_i∘g_i) is the (classical) composition of H(f_i) and H(g_i), because,

H(f_i)(H(g_i)(H(w_i))) = H(f_i)(H(g_i(w_i))) = H(f_i(g_i(w_i))), w_i ∈ W_i.

Identification of internal sequences

Just as sequences are special functions, internal sequences are special internal
functions. Suppose, given some n ∈ ℕ, that each f_i is a sequence (f_i(1), ..., f_i(n))
with n terms. (For more clarity, their arguments are not indicated by subscripts,
but are between parentheses.) Then H(f_i) too is a sequence with n terms, because
the domain of H(f_i) is equal to the domain of each f_i, i.e. {1, ..., n}. Why? The
j-th term of this sequence is,

H(f_i)(j) = H(f_i)(H(j)) = H(f_i(j)).

Hence in this case the internal sequence is a classical sequence with n internal
terms. But if n depends on i this is not true in general. For if the i-th sequence is
(f_i(1), ..., f_i(n_i)), then the argument or index of H(f_i) takes the form H(j_i) =
H(j_1, j_2, ...) and hence is an internal natural number that need not be a classical
natural number. Still it is meaningful to speak of the H(j_i)-th term of the internal
sequence, and this term is equal to H(f_i)(H(j_i)) = H(f_i(j_i)), j_i = 1, ..., n_i.

Example: Let n_i = i and f_i(j_i) = i + j_i, then the term with index H(j_i) =
H(j_1, j_2, ...), j_i = 1, ..., i, is H(1 + j_1, 2 + j_2, ...). Taking all j_i = 1 this gives
that H(2, 3, ...) is the first term. The n-th term is found by taking j_i = n for
all i ≥ n, and the result is H(?, ..., ?, 2n, 2n + 1, ...), where the n − 1 question
marks may be replaced by any numbers, as their values are irrelevant to the
value of H(?, ..., ?, 2n, 2n + 1, ...). The obvious choice leads to H(n + 1, ..., 2n −
1, 2n, 2n + 1, ...). And the term corresponding to index H(i) = H(1, 2, 3, ...) is
H(2i) = H(2, 4, 6, ...).

Suppose now that all f_i are infinite sequences. Then the internal sequence is not
a classical sequence; its argument is H(j_i) and the H(j_i)-th term is equal to

H(f_i)(H(j_i)) = H(f_i(j_i)), j_i = 1, 2, ...

Example: Let f_i(j_i) = i² + j_i³, then the term with index H(j_i) = H(j_1, j_2, ...) is
H(1² + j_1³, 2² + j_2³, ...), so that the first term is H(2, 5, 10, ...) and the second one
is H(9, 12, 17, ...). And the term corresponding to index H(i) = H(1, 2, 3, ...) is
H(i² + i³) = H(2, 12, 36, ...).
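
The generating terms in the two examples on internal sequences can be recomputed mechanically. The sketch below is only a check on finite prefixes; the helper term and the names f1 and f2 are hypothetical, and the second example is read as f_i(j_i) = i² + j_i³, which is the reading that reproduces the printed values 2, 5, 10 and 9, 12, 17 and 2, 12, 36.

```python
def term(f, index, length=3):
    """First `length` generating terms of H(f_i(j_i)) for the index H(j_i).

    f(i, j) plays the role of f_i(j); index(i) returns j_i.
    """
    return [f(i, index(i)) for i in range(1, length + 1)]

def f1(i, j):
    """First example: f_i(j) = i + j."""
    return i + j

def f2(i, j):
    """Second example, assumed reading: f_i(j) = i**2 + j**3."""
    return i ** 2 + j ** 3

first_term = term(f2, lambda i: 1)      # index H(1, 1, 1, ...)
second_term = term(f2, lambda i: 2)     # index H(2, 2, 2, ...)
diagonal_term = term(f2, lambda i: i)   # index H(1, 2, 3, ...)
```

The constant index functions correspond to classical indices, while the index function i ↦ i corresponds to the internal index H(1, 2, 3, ...), which is not a classical natural number.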



2.4 Standard constants; basic results for internal 
constants 

Any constant that is equal to H(s) = H(s, s, ...) for some classical constant s is
called standard. This does not mean that it needs to be a constant of 'standard',
that is to say classical, mathematics. It only means that it is closely related
to some classical constant, although in a number of cases it is indeed equal to
such a constant. Obviously, there is a bijection between the set of all classical
constants and the set of all standard constants, whereby s is mapped to H(s).
In order to emphasize this special relationship, instead of H(s) = H(s, s, ...) the
usual notation will be *s, and *s is called the *-transform of s. So, in case H(s)
is identified with s, it follows that *s = s, so that, for example, *5 = 5, hence 5
is a standard constant, but a function f : ℝ → ℝ is not, because *f : *ℝ → *ℝ
and *ℝ ⊃ ℝ. The strict inclusion here follows from the fact that, for example,
H(1, 2, 3, ...) ∉ ℝ and H(1, 1/2, 1/3, ...) ∉ ℝ, as can easily be shown by an
indirect argument and using the identification rules. Nevertheless, if x ∈ ℝ, then
*f(x) = f(x), so that *f is an extension of f.

Exercise: Show this. 

Obviously, a constant is internal if it is an element of some internal set. Even the 
following is true. 

Theorem 2.4.1 A constant is internal if and only if it is an element of some 
standard set. 

Proof: The if-part is obvious. Conversely, given any internal constant H(s_i) =
H(s_1, s_2, ...), let S = {s : s = s_i for some i}, then H(s_i) ∈ *S. □

Theorem 2.4.2 Let S be a classical set such that *s = s for each s ∈ S. Then
*S = S if and only if S is finite. Otherwise S ⊂ *S.

Proof: Since *s = H(s) = s if s ∈ S, *S = H(S) = {H(s_i) : s_i ∈ S} ⊇ S. If S is
finite then any H(s_i) = s for some s ∈ S, hence *S ⊆ S, and if S is infinite then
there exist s_1, s_2, ..., all different, so that H(s_i) ∉ S. □

As an example let S be a set of numbers. 

Corollary 2.4.1 Let S be a classical set, such that each of its elements is a 
classical set of which each element is equal to its *-transform. Then *S = S if 
and only if S is finite and all its elements are finite. A similar result holds if S is 
of any level (see Section 1.13 for a definition). 

Proof: Left as an exercise. □ 

In general, it may happen that S is not contained in *S. Examples are where
S = {ℝ}, because then *S = {*ℝ}, so that ℝ is not an element of *S, or where
S is a set of functions from ℝ to ℝ.

Now let f be some function from a set of numbers to another set of numbers.
When is *f = f? If this is true then the domains of both functions must be equal
and the theorem tells us when this is the case. Conversely, if both domains are
equal and hence are finite, then if X is their common domain it follows that
*f(x) = f(x) for all x ∈ X, so that if f is a function from X onto some set Y,
then Y must be finite, *Y = Y and *f too is a function from X onto Y. This can
be generalized as follows.

Corollary 2.4.2 Let f be a function from X onto Y, and assume that X is
finite, that *x = x for all x ∈ X and *y = y for all y ∈ Y. Then *X = X, *Y = Y
and *f = f. □

Corollary 2.4.3 Let g be a function from W to X, let f be a function from X
onto Y, and assume that X is finite, that *x = x for all x ∈ X and that *y = y
for all y ∈ Y. Then, with w_i, w ∈ W,

*(f∘g)(H(w_i)) = (f∘*g)(H(w_i)) and *(f∘g)(*w) = (f∘g)(w),

even if W is not finite.

Proof: It follows that *X = X, *Y = Y and that *f = f. Since *(f∘g) = *f∘*g, the
first equality follows immediately, and since *g(*w) = *(g(w)) = g(w) also the
second one follows quickly. □

A number of useful results are summarized below.

a) (Empty set) *∅ = ∅.

b) (Relations for sets) Given nonempty sets S_i, S, T_i and T,

H(S_i) = H(T_i) if and only if {i : S_i = T_i} ∈ U,
H(S_i) ≠ H(T_i) if and only if {i : S_i ≠ T_i} ∈ U,
H(s_i) ∈ H(S_i) if and only if {i : s_i ∈ S_i} ∈ U,
H(S_i) ⊆ H(T_i) if and only if {i : S_i ⊆ T_i} ∈ U;

in particular,

*S = *T if and only if S = T,
*S ≠ *T if and only if S ≠ T,
*s ∈ *S if and only if s ∈ S,
*S ⊆ *T if and only if S ⊆ T,

and similarly for ⊂, ⊇ and ⊃.

c) (Operations on sets) Given sets S_i, S, T_i and T,

H(S_i ∪ T_i) = H(S_i) ∪ H(T_i),
H(S_i ∩ T_i) = H(S_i) ∩ H(T_i),
H(S_i − T_i) = H(S_i) − H(T_i),

if S_i ⊆ T_i then H(S_i^c) = (H(S_i))^c, where taking the complement is
with respect to T_i and H(T_i), respectively;

in particular,

*(S ∪ T) = *S ∪ *T,
*(S ∩ T) = *S ∩ *T,
*(S − T) = *S − *T,

if S ⊆ T then *(S^c) = (*S)^c, where taking the complement is with
respect to T and *T, respectively.

d) (Pairs)

H((s_i, t_i)) = (H(s_i), H(t_i));

in particular,

*(s, t) = (*s, *t),

and similar equalities hold for n-tuples, n = 3, 4, ...

e) (Functions)

H(f_i(x_i)) = H(f_i)(H(x_i));

in particular,

*(f(x)) = *f(*x).

f) (Composite functions)

H((f_i∘g_i)(w_i)) = (H(f_i)∘H(g_i))(H(w_i));

in particular,

*[(f∘g)(w)] = (*f∘*g)(*w).
Most of these relationships are easily shown or even follow directly from
definitions. As far as a) is concerned, apply the definition of *S with S = ∅. Then
it appears that no H(s_i) can be found, so that *S must be empty, hence, *∅ = ∅.
And the proof to show under b) that *s ∈ *S if and only if s ∈ S, follows from
H(s_i) ∈ H(S_i) if and only if {i : s_i ∈ S_i} ∈ U, because this gives that *s ∈ *S if
and only if {i : s ∈ S} ∈ U and this is true if and only if s ∈ S. □

Remark: Obviously, not only, say, H(fi)(H(xi)) and *f(*x), but also mixtures like 
*f(H(xi)) and H(fi)(*x) are well-defined. 

Remark: It is not true that if t ∈ *S, then t = *s for some s ∈ S. As a
counterexample, let S = ℕ, and let t = H(1, 2, 3, ...).

Remark: It has been made clear that the inclusion *S ⊇ S does not always hold,
but it can be 'saved' by introducing σS, defined by

σS = {*s : s ∈ S},

because then trivially σS ⊆ *S. Moreover, there is a 'natural' bijection φ from S
onto σS, defined by φ(s) = *s. σS is sometimes called the standard copy of S, but
often it is not a standard set.

Remark: In another presentation of nonstandard analysis, *S is called the
nonstandard version of S, but *S is still called standard, which leads to the confusing
conclusion that the nonstandard version of a classical set is standard. In still
another presentation of nonstandard analysis *S is called the natural extension
of S, but as we have seen, *S is not always an extension of S.

So even the wording of nonstandard analysis is sometimes nonstandard. 



2.5 External constants 

Having defined internal and standard constants, let us now consider external 
constants. As already said before, an external constant is a constant that is not 
internal. 

Theorem 2.5.1 If S is a classical infinite set of numbers, then S is external.

Proof: First notice that if T is any set of classical numbers, then *T ⊇ T, because
if t ∈ T then *t = H(t) = t. This implies that if T ⊆ S, then S ∩ *T = T, for
if s ∈ S ∩ *T then s = *s ∈ *T, so that s ∈ T, hence S ∩ *T ⊆ T. Since also
T ⊆ S ∩ *T, it follows that S ∩ *T = T.

Now if S would be internal, then,

S = H(S_i) = {H(s_i) : s_i ∈ S_i} for suitable sets S_i.

Since S is infinite, it contains a countably infinite set,

T = {t_n : n ∈ ℕ}.

Let T_i = S_i ∩ T, then,

H(T_i) = H(S_i ∩ T) = H(S_i) ∩ H(T) = S ∩ *T = T,

so that T too would be internal. Let Q = {i : T_i is infinite}, then Q ∉ U, as
otherwise it is possible to select t'_i ∈ T_i if i ∈ Q, all different, so that, taking t'_i
arbitrarily if i ∉ Q, t' = H(t'_i) ∉ T, because for any t ∈ T the set {i : t'_i = t}
contains at most one element of Q and hence is not in U. But at the same time
t' ∈ T, because H(T_i) = T, a contradiction. Therefore,

Q^c = {i : T_i is finite} ∈ U.

Let, for i ∈ Q^c, t_{k(i)} be the element of T_i with largest index (if T_4 would be
{t_6, t_7, t_13}, then k(4) would be 13), and let k(i) be arbitrary if i ∉ Q^c, then

H(t_{k(1)}, t_{k(2)}, ...) ∈ T, so that H(t_{k(1)}, t_{k(2)}, ...) = t_k

for some k ∈ ℕ, hence {i : t_{k(i)} = t_k} ∈ U, or

R = {i : T_i contains at most k elements} ∈ U,

because of the definition of k(i), so that t_{k+1} ∉ T_i for all i ∈ R, hence,

t_{k+1} ∉ H(T_i) = T,

whereas t_{k+1} ∈ T, a contradiction. □

The theorem implies that, although we do not yet know much about *ℕ, *ℝ,
etc., we can say already now that ℕ is an external subset of *ℕ, that ℝ is an
external subset of *ℝ, etc.

Another consequence of the theorem is related to power sets. Let, given any
classical set A, P(A) be the power set of A, defined by,

P(A) = {S : S ⊆ A},

or, in words, P(A) is the set of all subsets of A (including A itself and ∅). Instead
of P(A) also the notation 2^A is used.
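
For a finite set the power set can be listed explicitly, which also shows where the notation 2^A comes from: a set with n elements has 2^n subsets. The following is a minimal sketch; the function name power_set is hypothetical, and it applies to finite classical sets only.

```python
from itertools import chain, combinations

def power_set(a):
    """All subsets of the finite set `a`, returned as a set of frozensets."""
    items = list(a)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}
```

For instance, power_set({1, 2, 3}) has 2³ = 8 elements, among them the empty set and {1, 2, 3} itself.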

Exercise: Show that if A is a set of numbers, then *(P(A)) = P(A) if A is finite.

Now let A be a set of numbers and let us compare *(P(A)) with P(*A). By
definition,

*(P(A)) = {H(S_i) : S_i ∈ P(A)} = {H(S_i) : S_i ⊆ A},

and

P(*A) = {T : T ⊆ *A}.

If T is any element of *(P(A)), then T = H(S_i) for certain S_i ⊆ A, hence

T = {H(s_i) : s_i ∈ S_i} ⊆ *A,

so that T ∈ P(*A). It follows that *(P(A)) ⊆ P(*A), i.e. applying the *-transform
at a lower level (A's level) gives no less than at a higher level (P(A)'s level). When
is this inclusion strict? If A is finite, then P(A) is finite too, hence *(P(A)) =
P(A) = P(*A), so that the inclusion is not strict. If A is infinite, then (by the last
theorem) A is an external subset T of *A, i.e. P(*A) contains an element that is
not internal, whereas any element of *(P(A)) is internal, and it follows that the
inclusion is strict. In other words:

Theorem 2.5.2 If A is a set of numbers, the inclusion *(P(A)) ⊆ P(*A) always
holds and is strict if and only if A is an infinite set, and if so, P(*A) contains an
element that is an external subset of *A. □

Since in many cases A will be infinite (e.g. if A = ℕ or A = ℝ), one should be
careful when dealing with P(*A) and prefer to work with *(P(A)) instead, which
is standard, so that all its elements are internal. This is the deeper reason behind
the rule advocated from the beginning of this book to formulate statements by
means of ∈, not by means of ⊂ or ⊆. When applying transfer later on it will
almost be a must to stick to this rule.



2.6 The *-transform of operations and expressions 

By operations are meant functions like taking the absolute value, addition, sub- 
traction, multiplication, division, taking the complement, union, intersection, etc. 
And by expressions are meant what results when operations or other functions, or 
compositions thereof are applied to suitable constants or variables. The simplest 
expressions contain only one function, in particular one operation and no other
functions, for example |x|, or x + y, or S ∪ T; more complicated expressions
are, for example, {f(x) + |y|} · z², or (S ∪ T) ∩ (V ∪ W). (In certain computer
languages expressions all of whose constants or variables are numbers are called
arithmetic expressions.)

Since, when introducing the internal version H(&_1, &_2, ...) of any classical
operation &, only one operation is given, i.e. & itself, all &_i must be taken equal to &,
so that only the standard operation *& can be introduced. Since operations are
functions, the definition of *& is straightforward. For example, standard addition
in *ℝ becomes *+, defined by,

H(s_i) *+ H(t_i) = H(s_i + t_i),

given any H(s_i) and H(t_i) in *ℝ. Since *+ is the only addition for hyperreals, the
asterisk in *+ may be dropped, because from the context it will always be clear
whether classical addition is meant or its *-transform *+. It should be kept in
mind, however, that often the domains of + and *+ are different, so that strictly
speaking the two operations are different. So the definition becomes,

H(s_i) + H(t_i) = H(s_i + t_i).

Whereas, by definition, internal addition is always standard, an internal sum need 
not, of course, be standard.

Standard subtraction, multiplication, division, taking the absolute value etc. are
defined in exactly the same way, except that the divisor of a division must be
nonzero. This extra condition can be met in the following two ways. Let $H(t_i)$ be
the divisor, then $H(t_i) \neq 0$ if and only if $\{i : t_i \neq 0\} \in U$. Now either each $t_i$ that
is zero is changed to an arbitrary nonzero number, say, 1, which has no effect on
the value of $H(t_i)$, and the definition becomes,

\[ H(s_i)/H(t_i) = H(s_i/t_i), \]

or the $t_i$ that are zero are left unchanged, and the definition becomes,

\[ H(s_i)/H(t_i) = H(r_i), \text{ with } r_i = s_i/t_i \text{ if } t_i \neq 0 \text{ and } r_i \text{ arbitrary otherwise.} \]

With regard to other operations similar refinements can be formulated, if neces- 
sary. 
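
As an illustration only, the componentwise definitions above can be mimicked in Python by representing a generating sequence as a function of the index $i$. This is a sketch under assumptions: the class name `Seq` and its methods are hypothetical, and the free ultrafilter $U$ is not represented at all (it cannot be written down explicitly), so only the componentwise part of the definitions is shown.

```python
from fractions import Fraction


class Seq:
    """A generating sequence (s_1, s_2, ...), stored as a function i -> s_i.

    Hypothetical helper: it models only the sequence behind H(s_i),
    not the equivalence class modulo the ultrafilter U.
    """

    def __init__(self, term):
        self.term = term  # term(i) returns s_i for i = 1, 2, 3, ...

    def __call__(self, i):
        return self.term(i)

    def __add__(self, other):
        # H(s_i) + H(t_i) = H(s_i + t_i)
        return Seq(lambda i: self(i) + other(i))

    def __truediv__(self, other):
        # First variant in the text: every t_i that is zero is replaced by 1.
        def quotient(i):
            t = other(i)
            return Fraction(self(i)) / (t if t != 0 else 1)
        return Seq(quotient)


s = Seq(lambda i: i)        # generates H(1, 2, 3, ...)
t = Seq(lambda i: i - 1)    # t_1 = 0, and t_i != 0 for i >= 2
total = s + t               # generates H(2i - 1)
ratio = s / t               # generates H(i / (i - 1)), with t_1 replaced by 1
```

Replacing $t_1$ changes the sequence only on the index set $\{1\}$, which is finite and therefore not an element of a free ultrafilter, so the value of the quotient is unaffected.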

Exercise: Consider the composition of two functions. 

Another group of operations is formed by the set operations of forming the union
or the intersection or the difference of sets, or the complement of a set, hence $\cup$,
$\cap$, $-$, and ${}^c$. Again only their standard forms can be introduced. But whereas,
say, addition has no meaning for the hyperreals, unless it is explicitly introduced
for them, the classical versions of $\cup$, $\cap$, $-$, and ${}^c$ also have a meaning for internal
sets, so that, for example ${}^*\cup$ must be distinguished from $\cup$, leading to,

\[ H(S_i) \mathbin{{}^*\cup} H(T_i) \quad\text{and}\quad H(S_i) \cup H(T_i), \]

where, by definition,

\[ H(S_i) \mathbin{{}^*\cup} H(T_i) = H(S_i \cup T_i), \text{ for hypersets } H(S_i) \text{ and } H(T_i). \]

Fortunately, ${}^*\cup = \cup$, and the same holds for $\cap$, $-$, and ${}^c$.

It would be wrong, however, to assume that any operation on sets would be
equal to its *-transform. The notorious exception is power set formation. For, by
definition,

\[ ({}^*\mathcal{P})(H(S_i)) = H(\mathcal{P}(S_i)) = H(\{T_i : T_i \subset S_i\}) = \{H(T_i) : H(T_i) \subset H(S_i)\}, \]

and,

\[ \mathcal{P}(H(S_i)) = \{T : T \subset H(S_i)\}, \]

but whereas $H(T_i)$ is internal, $T$ might be external. So the equality ${}^*\& = \&$ is
only guaranteed for $\& = \cup$, $\cap$, $-$, or ${}^c$, and $\mathcal{P}$ should be avoided even for internal
sets.

The internal forms of more complicated expressions such as $\{f(x) + |y|\} \cdot z^2$ and
$(S \cup T) \cap (V \cup W)$, become

\[ \{H(f_i)(H(x_i)) + |H(y_i)|\} \cdot (H(z_i))^2 \quad\text{and}\quad (H(S_i) \cup H(T_i)) \cap (H(V_i) \cup H(W_i)), \]

respectively. Taking everything standard gives,

\[ \{{}^*f({}^*x) + |{}^*y|\} \cdot ({}^*z)^2 = {}^*[\{f(x) + |y|\} \cdot z^2] \quad\text{and} \]

\[ ({}^*S \cup {}^*T) \cap ({}^*V \cup {}^*W) = {}^*[(S \cup T) \cap (V \cup W)]. \]

2.7 The *-transform of relations and statements; Łoś' theorem; the internal definition principle

The simplest mathematical relations are the atomic relations, by which are meant
relations containing neither logical connectives nor quantifiers, hence relations
such as $=$, $<$, $\in$, etc. They can be regarded as functions to the set $\mathbb{B} = \{\text{true}, \text{false}\}$,
where true and false are the Boolean constants, which are urelements.
By definition, ${}^*\text{true} = \text{true}$ and ${}^*\text{false} = \text{false}$, hence ${}^*\mathbb{B} = \mathbb{B}$. (In order to
avoid confusion equivalence will be indicated by means of $\equiv$.) An atomic relation
with $n$ arguments is called $n$-ary. Many atomic relations are binary. Atomic
statements result when atomic relations are applied to suitable arguments, which
must be expressions. The smaller-than relation in $\mathbb{R}$ thus leads to the atomic
statements $s < t$ (in which case $R$ is binary), $s < t < u$ (now $R$ is ternary),
etc. with $s, t, u \in \mathbb{R}$. So if a relation is regarded as a function, the corresponding
statement should be regarded as a function value. As with operations, only the
standard form of internal relations is introduced, but whereas some of them are
new in nonstandard analysis (e.g. $<$) others are not, namely when the classical
form is also meaningful for internal constants (e.g. $=$, $\in$, $\subset$).

From the correspondence between relations and functions it follows, that if $R$ is
a binary relation such as $<$ or $\subset$, ${}^*R$ is defined by,

\[ H(s_i) \mathrel{{}^*R} H(t_i) = H(s_i \mathrel{R} t_i), \]

for suitable expressions $s_i$ and $t_i$. Here by definition,

\[ H(s_i \mathrel{R} t_i) \equiv \{i : s_i \mathrel{R} t_i\} \in U, \]

that is to say,

\[ H(s_i \mathrel{R} t_i) = \text{true if and only if } \{i : s_i \mathrel{R} t_i = \text{true}\} \in U. \]

Hence ${}^*R$ too is a binary relation. In fact, this is a very special case of Łoś'
theorem, to be considered below. In particular, letting $s_i = s$ and $t_i = t$ for all $i$,

\[ {}^*[s \mathrel{R} t] = {}^*s \mathrel{{}^*R} {}^*t \equiv s \mathrel{R} t, \]

as $\{i : s \mathrel{R} t\} \in U \equiv s \mathrel{R} t$, since $s \mathrel{R} t$ does not depend on $i$. This means that ${}^*s \mathrel{{}^*R} {}^*t$
is equivalent to the classical statement that is obtained by removing all asterisks,
which is a very simple case of transfer. For $n$-ary atomic relations the definitions
are analogous.
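
A small Python sketch may make the index set $\{i : s_i \mathrel{R} t_i\}$ concrete. It is an illustration under assumptions only: the helper `index_set` is hypothetical and inspects a finite window of indices, whereas membership in $U$ concerns the full infinite index set.

```python
import operator


def index_set(s, t, rel, window):
    """Return {i in 1..window : s(i) rel t(i)} for sequences given as functions."""
    return {i for i in range(1, window + 1) if rel(s(i), t(i))}


# s_i = 10 for all i (generating the standard number *10) and t_i = i (generating H(i)).
agree = index_set(lambda i: 10, lambda i: i, operator.lt, 20)
# s_i < t_i fails only for i = 1, ..., 10, which is a finite set of indices.
```

Here the full index set $\{i : 10 < i\}$ has a finite complement, so it belongs to every free ultrafilter, and the internal statement ${}^*10 < H(i)$ is true whatever $U$ is.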

In case $R$ is one of the relations $=$, $\neq$, $\in$, $\subset$, $\subseteq$, $\supset$, $\supseteq$, $R$ has also a meaning for
internal expressions. Fortunately, these relations are equivalent to their standard
forms.

Proof: Consider $\in$. By definition,

\[ H(s_i) \mathrel{{}^*{\in}} H(S_i) = H(s_i \in S_i) \equiv \{i : s_i \in S_i\} \in U \equiv H(s_i) \in H(S_i). \]

The other cases are left as exercises. □

In case R has no meaning within nonstandard analysis, it will usually cause no 
confusion when the asterisk in *R is dropped. Below this is done anyway when 
explicit examples are given, such as the next two. 

Examples:

1) $\{H(f_i)(H(x_i)) + |H(y_i)|\} \cdot (H(z_i))^2 < (H(s_i) + H(t_i)) \cdot H(r_i) = H[\{f_i(x_i) + |y_i|\} \cdot z_i^2 < (s_i + t_i) \cdot r_i]$

and this is also equivalent to,

\[ \{i : \{f_i(x_i) + |y_i|\} \cdot z_i^2 < (s_i + t_i) \cdot r_i\} \in U. \]

Taking everything standard the results are,

2) $\{{}^*f({}^*x) + |{}^*y|\} \cdot ({}^*z)^2 < ({}^*s + {}^*t) \cdot {}^*r = {}^*[\{f(x) + |y|\} \cdot z^2 < (s + t) \cdot r]$

which is equivalent to,

\[ \{f(x) + |y|\} \cdot z^2 < (s + t) \cdot r, \]

because,

$\{i : \{f(x) + |y|\} \cdot z^2 < (s + t) \cdot r\} \in U$ if and only if $\{f(x) + |y|\} \cdot z^2 < (s + t) \cdot r$,

as the latter statement does not depend on $i$.

In the first example to the left there are several internal variables (the constant 
2 is even standard), whereas to the right there is only a single internal state- 
ment; and in the second example to the left there are several standard variables, 
whereas to the right there is only a single standard statement, that, moreover, is
equivalent to the corresponding classical statement. As already indicated before
this equivalence is an example of transfer, but a very simple one because neither
logical connectives nor quantifiers occur. Note that in cases like,

\[ \{{}^*f({}^*x) + |{}^*y|\} \cdot ({}^*z)^2 = {}^*[\{f(x) + |y|\} \cdot z^2] \quad\text{and} \]

\[ ({}^*S \cup {}^*T) \cap ({}^*V \cup {}^*W) = {}^*[(S \cup T) \cap (V \cup W)], \]

considered in the preceding section, it is not always allowed to drop the asterisk
to the right, because if $s$ is some classical constant ${}^*s$ may be different from $s$. So
at this point there is a divergence between expressions and statements.

An arbitrary statement is composed of a finite number of atomic relations, logical
connectives ($\neg$, $\wedge$, $\vee$, $\Rightarrow$), quantifiers ($\forall$ and $\exists$), constants, free variables and
bound variables. First of all, statements with at least one logical connective, but
without quantifiers will be considered. They will be written as,

\[ R(P(s, s', s'', \ldots), Q(t, t', t'', \ldots), S(u, u', u'', \ldots), \ldots), \]

where $P, Q, S, \ldots$ are atomic relations, and $s, s', s'', \ldots$, $t, t', t'', \ldots$, $u, u', u'', \ldots$
are expressions of constants and free variables. In the beginning of this section it
was found more convenient to write the atomic statement derived from an atomic
binary relation $R$ as $s \mathrel{R} s'$. This is now written as $R(s, s')$. In order to simplify
the notation $(s, s', s'', \ldots)$, $(t, t', t'', \ldots)$, $(u, u', u'', \ldots)$ will be abbreviated to $(s)$,
$(t)$, $(u)$, respectively. Regarding $R(P(s), Q(t), S(u), \ldots)$ as a function value, the
corresponding (Boolean) function is, of course, $R$, which is called a relation. Its
domain is $\mathbb{B} \times \mathbb{B} \times \mathbb{B} \times \ldots$, and its range is $\mathbb{B}$, where, as before, $\mathbb{B} = \{\text{true}, \text{false}\}$.

As an example, let

\[ R(P(s), Q(t), S(u)) = (P(s) \wedge Q(t)) \Rightarrow \neg(S(u)), \]

which, emphasizing that the logical connectives are functions, can be written as,

\[ ((\Rightarrow \circ (\wedge, \neg)) \circ (P, Q, S))(s, t, u), \]

where by definition $R = (\Rightarrow \circ (\wedge, \neg))$, $(P, Q, S)(s, t, u) = (P(s), Q(t), S(u))$ and
$(\Rightarrow \circ (\wedge, \neg))(x_1, x_2, x_3) = (x_1 \wedge x_2) \Rightarrow (\neg x_3)$.

Remark: In formal logic expressions, statements, and statements without free
variables are often called terms, formulae (or predicates), and sentences, respectively.

As before, internal relations will only be standard, so that internal statements
have the form,

\[ H(R(P(s_i), Q(t_i), S(u_i), \ldots)) = {}^*R({}^*P(H(s_i)), {}^*Q(H(t_i)), {}^*S(H(u_i)), \ldots). \]

Continuing the last example, this gives,

\[ \begin{aligned}
H(R(P(s_i), Q(t_i), S(u_i))) &= {}^*R({}^*P(H(s_i)), {}^*Q(H(t_i)), {}^*S(H(u_i))) \\
&= ({}^*P(H(s_i)) \mathbin{{}^*\wedge} {}^*Q(H(t_i))) \mathbin{{}^*{\Rightarrow}} {}^*\neg({}^*S(H(u_i))) \\
&= (({}^*{\Rightarrow} \circ ({}^*\wedge, {}^*\neg)) \circ ({}^*P, {}^*Q, {}^*S))(H(s_i), H(t_i), H(u_i)).
\end{aligned} \]

But, whereas the domains of $P$, $Q$ and $S$ are rather arbitrary their ranges are
$\mathbb{B}$, as is that of $R$, so that the range of $g = (P, Q, S)$ is $X = \mathbb{B} \times \mathbb{B} \times \mathbb{B}$ and as
${}^*\mathbb{B} = \mathbb{B}$ and ${}^*(\mathbb{B} \times \mathbb{B} \times \mathbb{B}) = \mathbb{B} \times \mathbb{B} \times \mathbb{B}$, by Corollary 2.4.2 with $f = (\Rightarrow \circ (\wedge, \neg))$
and $Y = \mathbb{B}$, it follows that,

\[ {}^*R = R = {}^*((\Rightarrow \circ (\wedge, \neg))) = ({}^*{\Rightarrow} \circ ({}^*\wedge, {}^*\neg)) = (\Rightarrow \circ (\wedge, \neg)). \]

Hence,

\[ H(R(P(s_i), Q(t_i), S(u_i))) = R({}^*P(H(s_i)), {}^*Q(H(t_i)), {}^*S(H(u_i))), \]

where the asterisks in ${}^*P$, ${}^*Q$, and ${}^*S$ could be dropped, as it is clear that standard
forms are meant. This is a less trivial case of Łoś' theorem.

Taking $s_i = s$, $t_i = t$, $u_i = u$ for all $i$, Corollary 2.4.3 gives that,

\[ {}^*[R(P(s), Q(t), S(u))] = H(R(P(s), Q(t), S(u))) = R(P(s), Q(t), S(u)) \]

that is,

\[ {}^*[R(P(s), Q(t), S(u))] \equiv R(P(s), Q(t), S(u)), \]

hence the standard form of the sample statement is equivalent to its classical
form, which expresses a less trivial case of transfer.

Exactly the same kind of argument can be repeated for any other statement 
without quantifiers. 
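
The view of logical connectives as functions on $\{\text{true}, \text{false}\}$ can be sketched in Python for the sample relation $R = (\Rightarrow \circ (\wedge, \neg))$. The atomic statements chosen in `statement_at` are illustrative assumptions, not taken from the text, and the sketch only evaluates the classical statements index by index.

```python
def implies(a, b):
    return (not a) or b


def conj(a, b):
    return a and b


def neg(a):
    return not a


def R(p, q, s):
    """R = (=> o (and, not)): (p and q) => (not s), on Boolean values."""
    return implies(conj(p, q), neg(s))


def statement_at(i):
    """Truth value of R(P(s_i), Q(t_i), S(u_i)) at index i.

    Illustrative choices (assumptions):
    P(s_i): s_i > 0 with s_i = i;  Q(t_i): t_i is even with t_i = 2*i;
    S(u_i): u_i < 0 with u_i = -i for odd i and u_i = i for even i.
    """
    p = i > 0
    q = (2 * i) % 2 == 0
    u = -i if i % 2 == 1 else i
    return R(p, q, u < 0)


true_indices = {i for i in range(1, 11) if statement_at(i)}
```

In this example the classical statement holds exactly at the even indices, so the internal statement is true if and only if the set of even indices belongs to $U$, which the sketch cannot decide.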

Finally, let arbitrary statements with at least one quantifier be given. They are 
assumed to be in prenex normal form (see Section 1.3), hence with all logical 
connectives to the right of the quantifiers. Also it is assumed that each bound 
variable occurs to the left of the $\in$ relation. From this it follows that in any
internal statement each bound variable automatically is internal, even if it is not
explicitly written as such. To see this observe that if to the right of this $\in$ relation
there is a constant or a free variable, then the bound variable must be internal, 
because that constant or that free variable is assumed to be internal, and if there 
is a bound variable, then the latter must be internal, as can be seen by repeating 
the argument. As an example see the least upper bound theorem below. 

First let just one quantifier be included:

\[ \exists x \in X : R(P(x, s), \ldots), \]

or,

\[ \forall x \in X : R(P(x, s), \ldots), \]

where $X$ is some set and $R(P(x, s), \ldots)$ is an arbitrary statement without
quantifiers. Then, given sets $X_i$,

\[ H(\exists x_i \in X_i : R(P(x_i, s_i), \ldots)) \equiv \exists H(x_i) \in H(X_i) : R({}^*P(H(x_i), H(s_i)), \ldots), \]

and similarly for $\forall$. Note that to the left $x_i$ and to the right $H(x_i)$ may be replaced
by $x$. To prove this equivalence observe that the statement to the left is equivalent
to,

1) $Q = \{i : [\exists x_i \in X_i : R(P(x_i, s_i), \ldots)]\} \in U$,

and that the statement to the right is equivalent to,

\[ R({}^*P(H(x'_i), H(s_i)), \ldots), \]

for certain $x'_i$, hence, as no quantifiers are involved, to,

\[ H(R(P(x'_i, s_i), \ldots)), \]

hence to,

2) $\{i : R(P(x'_i, s_i), \ldots)\} \in U$.

Now if statement 1) is true, for $i \in Q$ take some $x'_i \in X_i$ such that $R(P(x'_i, s_i), \ldots)$,
and if $i \notin Q$ take $x'_i$ arbitrary. Then,

\[ \{i : [\exists x_i \in X_i : R(P(x_i, s_i), \ldots)]\} \subseteq \{i : R(P(x'_i, s_i), \ldots)\}, \]

and statement 2) follows. Conversely, if statement 2) is true, then, since,

\[ \{i : [\exists x_i \in X_i : R(P(x_i, s_i), \ldots)]\} \supseteq \{i : R(P(x'_i, s_i), \ldots)\}, \]

statement 1) follows. This completes the proof of the equivalence. □

A similar result for $\forall$ can be shown as follows.

\[ \begin{aligned}
\neg[H(\forall x_i \in X_i : R(P(x_i, s_i), \ldots))] &\equiv H(\exists x_i \in X_i : \neg[R(P(x_i, s_i), \ldots)]) \\
&\equiv \exists H(x_i) \in H(X_i) : \neg[R({}^*P(H(x_i), H(s_i)), \ldots)] \\
&\equiv \neg[\forall H(x_i) \in H(X_i) : R({}^*P(H(x_i), H(s_i)), \ldots)].
\end{aligned} \]

□

Secondly, let two quantifiers be involved, as in,

\[ \exists x \in X : \forall y \in Y : R(P(x, y, s), \ldots), \]

then a similar result can be proved by using that,

\[ H(\forall y_i \in Y_i : R(P(x_i, y_i, s_i), \ldots)) \equiv \forall H(y_i) \in H(Y_i) : R({}^*P(H(x_i), H(y_i), H(s_i)), \ldots), \]

since here only one quantifier is involved.

Applying induction, it follows that a similar result can be shown for a statement 
containing arbitrarily many quantifiers. Again the asterisk in *P may be dropped 
if no confusion can arise. Hence the following result - which is quite fundamental 
- has been proved. In its formulation $R$ is no longer regarded as a function of
substatements $P(s)$, $Q(t)$, $S(u)$, $\ldots$, and of sets $X$, $X'$, $X''$, $\ldots$, required in the
quantifications, but simply as a function of constants and free variables $X$, $X'$,
$X''$, $\ldots$, $s$, $s'$, $s''$, $\ldots$, so that $P$, $Q$, $S$, $\ldots$ have altogether disappeared.

Theorem 2.7.1 (Łoś' theorem.)
Let any classical statement,

\[ R(X, X', X'', \ldots; s, s', s'', \ldots), \]

with a finite number of constants or free variables $X$, $X'$, $X''$, $\ldots$, $s$, $s'$, $s''$, $\ldots$, and
a finite number of logical connectives and quantifiers be given. Certain standard
constants such as 0 need not be mentioned explicitly. $X$, $X'$, $X''$, $\ldots$ are the sets
required to formulate the quantifications properly; that is to say that $X$ must
occur in $\exists x \in X$ or in $\forall x \in X$, for some suitable bound variable $x$, and similarly
for $X'$, $X''$, $\ldots$, and that conversely each quantification is taken care of this way.
Then,

\[ H[R(X_i, X'_i, X''_i, \ldots; s_i, s'_i, s''_i, \ldots)] \equiv R(H(X_i), H(X'_i), H(X''_i), \ldots; H(s_i), H(s'_i), H(s''_i), \ldots). \]

□

In other words, (regarding for the time being each constant as a free variable)
given any classical statement, to each of its free variables $q$ (so that here
$q \in \{X, X', X'', \ldots, s, s', s'', \ldots\}$) add the index $i$, which defines infinite sequences
$(q_i)$, and which also defines an infinite sequence of classical statements, which
sequence in turn defines an internal statement (the one to the left). Then the
latter is equivalent to the statement (the one to the right) that results from the
given classical statement by replacing each free variable $q$ by the internal free
variable $H(q_i)$ that is defined by the infinite sequence $(q_i)$. It may happen, of
course, that for certain $q$ this leads to ${}^*q$, namely when $q_i = q$ for all $i$. Or, in
still other words, given any classical statement, replace each of its free variables
$q$ by its internal version $H(q_i)$ (that might be its standard version ${}^*q$), then the
resulting statement (the one to the right) is equivalent to the statement (the one
to the left) that results from it by removing all $H$'s and all asterisks, and putting
one single $H$ in front.

The formulation of the theorem implies that each bound variable must occur in 
some set inclusion. A more careless formulation would be allowed if it would be 
required that each bound variable is internal (which is automatically true in the 
formulation given). 

So far, bound variables were only combined with quantifications, but they can
occur in other formulations as well. A very usual one is the definition of a set, such
as, $\{x \in X : P(x, X, s)\}$, where $P(x, X, s)$ is some statement. Then obviously $x$
is a bound variable that is not combined with $\exists$ or $\forall$, although one says 'the set
of all $x$ in $X$ such that . . . ' Anyway, the result is a set, hence a constant or a
variable, not a statement. Yet, the theorem can be applied here and leads to the
following corollary.

Corollary 2.7.1 (The internal definition principle.)

Let any statement be given, say $P(x, X, s)$, where $X$ is some set and $x \in X$ makes
sense. Let $X$ and $s$ be internal, then so is the set

\[ T = \{x \in X : P(x, X, s)\}. \]

Proof: Since $X$ is internal, also $x$ is internal, so for suitable $X_i$ and $s_i$,

\[ T = \{H(x_i) \in H(X_i) : P(H(x_i), H(X_i), H(s_i))\}. \]

Let,

\[ T_i = \{x_i \in X_i : P(x_i, X_i, s_i)\}. \]

Then $H(x_i) \in T$ if and only if $H(x_i) \in H(X_i) \wedge P(H(x_i), H(X_i), H(s_i))$, hence
by Łoś' theorem, if and only if $H(x_i \in X_i \wedge P(x_i, X_i, s_i))$, hence if and only if
$H(x_i \in T_i)$, hence if and only if $H(x_i) \in H(T_i)$, so that

\[ H(x_i) \in T \equiv H(x_i) \in H(T_i), \]

i.e. $T = H(T_i)$, hence $T$ is internal. □
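
The proof builds the generating sequence $(T_i)$ index by index, which can be sketched in Python. The concrete choices below ($X_i = \{1, \ldots, i\}$, $s_i = i$, and $P$ meaning "$x$ divides $s$") are assumptions made for illustration only.

```python
def X(i):
    """X_i = {1, ..., i}, a generating sequence for an internal set."""
    return set(range(1, i + 1))


def P(x, X_i, s_i):
    """Illustrative statement (an assumption): x divides s_i."""
    return s_i % x == 0


def T(i, s=lambda i: i):
    """T_i = {x in X_i : P(x, X_i, s_i)}, built index by index."""
    return {x for x in X(i) if P(x, X(i), s(i))}
```

With these choices $T_i$ is the set of divisors of $i$, so $T = H(T_i)$ is the internal set of divisors of the hyperlarge number $H(i)$.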



2.8 Transfer; the standard definition principle 

In this section a few consequences of Łoś' theorem will be presented that belong
to the main tools of nonstandard analysis.

Theorem 2.8.1 (Transfer, first formulation.)

Let $R(X, X', X'', \ldots; s, s', s'', \ldots)$ be as before. Then,

\[ R(X, X', X'', \ldots; s, s', s'', \ldots) \equiv R({}^*X, {}^*X', {}^*X'', \ldots; {}^*s, {}^*s', {}^*s'', \ldots). \]

Proof: In Łoś' theorem take $X_i = X$, $X'_i = X'$, $X''_i = X''$, $\ldots$, $s_i = s$, $s'_i = s'$,
$s''_i = s''$, $\ldots$, for all $i$, then,

\[ {}^*[R(X, X', X'', \ldots; s, s', s'', \ldots)] \equiv R({}^*X, {}^*X', {}^*X'', \ldots; {}^*s, {}^*s', {}^*s'', \ldots), \]

but,

\[ {}^*[R(X, X', X'', \ldots; s, s', s'', \ldots)] \equiv R(X, X', X'', \ldots; s, s', s'', \ldots). \]

□

One fact is disguised in this formulation of transfer, namely that in the statement
to the right the bound variables need not be standard. A simple example may
clarify this:

\[ \exists x \in X : P(x, s) \equiv \exists H(x_i) \in {}^*X : P(H(x_i), {}^*s) \equiv \exists x \in {}^*X : P(x, {}^*s), \]

where $P(x, s)$ is some substatement. So transfer expresses the fact that any
classical statement is equivalent to the nonstandard statement that results from it
by replacing everything by its *-transform except the bound variables. What will
happen if the bound variables too are replaced by their *-transforms? In the
example this gives,

\[ \exists {}^*x \in {}^*X : P({}^*x, {}^*s), \]

but since ${}^*x \in {}^*X \equiv x \in X$ and $P({}^*x, {}^*s) \equiv P(x, s)$, it follows that this is
equivalent to $\exists x \in X : P(x, s)$, and a similar equivalence holds for any classical
statement. This leads to transfer in another formulation.

Theorem 2.8.2 (Transfer, second formulation.) 

Given any internal statement, replacing everything, including every bound vari- 
able, by its standard version is equivalent to replacing everything except every 
bound variable by its standard version. □ 

The advantage of transfer in its first formulation is that it is the bridge between 
classical mathematics and nonstandard mathematics; that of transfer in its second 
formulation that one remains within nonstandard mathematics, so that one may 
forget about classical mathematics, that via the *-transform is mapped in a one- 
to-one kind of way to a certain part of nonstandard mathematics. In the second 
formulation it comes close to what it is in Nelson's internal set theory, that strictly 
speaking ignores classical mathematics completely, and that within nonstandard 
mathematics defines the difference between standard and internal (and external, 
which in essence is an irrelevant notion in this theory, however). 

Let us now see what really is the essential part of transfer. First of all it is clear
that it is only of some value if bound variables are present. Simple examples of
transfer in its second formulation are,

\[ \exists {}^*x \in {}^*X : P({}^*x, {}^*s) \equiv \exists x \in {}^*X : P(x, {}^*s), \]

and

\[ \forall {}^*x \in {}^*X : P({}^*x, {}^*s) \equiv \forall x \in {}^*X : P(x, {}^*s), \]

where $H(x_i)$ has been replaced by $x$. Keep in mind that $x \in {}^*X$ implies that
$x$ is not necessarily standard, but internal. Clearly, in one direction these two
equivalences are trivial (at least if $X \subset {}^*X$), and what is really important are
the following two implications,

\[ \exists x \in {}^*X : P(x, {}^*s) \Rightarrow \exists {}^*x \in {}^*X : P({}^*x, {}^*s), \]

and

\[ \forall {}^*x \in {}^*X : P({}^*x, {}^*s) \Rightarrow \forall x \in {}^*X : P(x, {}^*s), \]

or, in words, if there exists an internal x such that something is true, then there 
even exists a standard *x such that that something is true; and if something is 
true for all standard *x, then that something is even true for all internal x. The 
second implication leads from classical mathematics towards nonstandard math- 
ematics, and the first one leads in the opposite direction back from nonstandard 
mathematics to classical mathematics. 

Obviously, in case *X = X, also the two implications are trivial, and transfer is of 
no use. Examples where this is not the case have already been given in Section 1.4. 
In that section it also became clear what really is the purpose of transfer: in a 
number of important cases the nonstandard form of a classical statement can be 
given a much simpler form (see Section 1.4, where statements (1.1) and (1.1) are 
equivalent to statement (1.1)). Inevitably, this simpler form requires the use of 
certain internal constants that are not standard (nonzero infinitesimals in the 
example of Section 1.4). 

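Example: As a nontrivial example let us consider the least upper bound theorem
for $\mathbb{R}$, which in its classical form reads,

\[ \begin{aligned}
&\forall X \in \mathcal{P}(\mathbb{R}) : \{X \neq \emptyset \wedge [\exists b \in \mathbb{R} : \forall x \in X : x \leq b]\} \Rightarrow \\
&\exists \beta \in \mathbb{R} : [\forall x \in X : x \leq \beta] \wedge [\forall \varepsilon \in \mathbb{R}, \varepsilon > 0 : \exists x \in X : x > \beta - \varepsilon].
\end{aligned} \]

By transfer in its first formulation this is equivalent to,

\[ \begin{aligned}
&\forall X \in {}^*(\mathcal{P}(\mathbb{R})) : \\
&\{X \neq \emptyset \wedge [\exists b \in {}^*\mathbb{R} : \forall x \in X : x \leq b]\} \Rightarrow \\
&\exists \beta \in {}^*\mathbb{R} : [\forall x \in X : x \leq \beta] \wedge [\forall \varepsilon \in {}^*\mathbb{R}, \varepsilon > {}^*0 : \exists x \in X : x > \beta - \varepsilon],
\end{aligned} \]

which by transfer in its second formulation is equivalent to,

\[ \begin{aligned}
&\forall {}^*X \in {}^*(\mathcal{P}(\mathbb{R})) : \\
&\{{}^*X \neq \emptyset \wedge [\exists {}^*b \in {}^*\mathbb{R} : \forall {}^*x \in {}^*X : {}^*x \leq {}^*b]\} \Rightarrow \\
&\exists {}^*\beta \in {}^*\mathbb{R} : [\forall {}^*x \in {}^*X : {}^*x \leq {}^*\beta] \wedge [\forall {}^*\varepsilon \in {}^*\mathbb{R}, {}^*\varepsilon > 0 : \exists {}^*x \in {}^*X : {}^*x > {}^*\beta - {}^*\varepsilon],
\end{aligned} \]

which is equivalent to the classical version of the theorem. Note that in this
example different bound variables have been indicated by the same symbol, which is
against the rule advocated before in Section 1.3, but this time seems appropriate.
Also note that internal bound variables have not explicitly been indicated as such,
simply because they are automatically internal.

Exercise: Apply Łoś' theorem to the least upper bound theorem, but with internal
free variables.

Just as transfer is a direct specialization of Łoś' theorem, the next result is a
direct specialization of the internal definition principle.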

Theorem 2.8.3 (The standard definition principle.)

Let ${}^*X$ be a standard set, $x \in {}^*X$ make sense, ${}^*s$ be standard, and $P(x, {}^*X, {}^*s)$
be any statement. Then also,

\[ \{x \in {}^*X : P(x, {}^*X, {}^*s)\} \]

is standard.

Proof: The proof is a simplification of the proof of the internal definition principle, 
and is left as an exercise. □ 



2.9 The *-transform of attributes 

So far the *-transform was concerned with expressions and statements. In this 
section the *-transforms of a number of attributes, such as finiteness are consid- 
ered. 

A) *finite or hyperfinite sets.

A set $S = H(S_i)$ is called *finite or hyperfinite if all $S_i$ are finite sets, or
equivalently, if $\{i : S_i \text{ is finite}\} \in U$. This does not mean that $S$ is a finite set.
As a counterexample, let $S_i = \{1, 2, \ldots, i\}$, then the smallest element of $S$ is
$m = 1 = H(1, 1, 1, \ldots)$, the largest is $M = H(1, 2, 3, \ldots) = H(i)$, and those in
between are $H(s_i)$ with $1 \leq s_i \leq i$. It follows that $S$ is an infinite set, for if not
then all of them would be at most $n = H(n, n, n, \ldots)$ for some $n \in \mathbb{N}$. In
particular, $M \leq n$, which is not true. Nevertheless, a hyperfinite set can be treated as
if it were finite. For example, just as finite sets of classical numbers, hyperfinite
sets of internal numbers have a smallest as well as a largest element, hence are
bounded.

Exercise: Show this.

Another example is where $\omega \sim \infty$ and $f$ is an internal function from ${}^*\mathbb{N}$ to $\{1, 2\}$,
such that $f(1) = 1$, $f(\omega) = 2$. Then there exists a largest $k \in {}^*\mathbb{N}$, $1 \leq k \leq \omega$,
such that $f(k) = 1$, hence such that $f(j) = 2$ if $k + 1 \leq j \leq \omega$. The hyperfinite
set here is $\{j : 1 \leq j \leq H(n_i)\} = H(S_i)$ with $\omega = H(n_i)$ and $S_i = \{1, \ldots, n_i\}$.
Such a result is trivial for finite sets, and by transfer carries over to the present
case. For let $F$ be the set of all classical $f : \mathbb{N} \to \{1, 2\}$. Then that trivial result
reads, more formally,

\[ \begin{aligned}
&\forall n \in \mathbb{N} : \forall f \in F, f(1) = 1, f(n) = 2 : \exists k \in \mathbb{N}, 1 \leq k \leq n : \\
&[f(k) = 1 \wedge \forall j \in \mathbb{N}, k + 1 \leq j \leq n : f(j) = 2],
\end{aligned} \]

hence, by transfer,

\[ \begin{aligned}
&\forall n \in {}^*\mathbb{N} : \forall f \in {}^*F, f(1) = 1, f(n) = 2 : \exists k \in {}^*\mathbb{N}, 1 \leq k \leq n : \\
&[f(k) = 1 \wedge \forall j \in {}^*\mathbb{N}, k + 1 \leq j \leq n : f(j) = 2],
\end{aligned} \]

in particular if $n = \omega \sim \infty$.
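
The classical, finite version of this result is easy to check by direct search. The following Python sketch does so for one sample function; the helper name `largest_k` and the sample `f` are hypothetical choices made for illustration.

```python
def largest_k(f, n):
    """Largest k with 1 <= k <= n and f(k) == 1, assuming f(1) == 1 and f(n) == 2."""
    if f(1) != 1 or f(n) != 2:
        raise ValueError("requires f(1) == 1 and f(n) == 2")
    return max(k for k in range(1, n + 1) if f(k) == 1)


f = lambda j: 1 if j in (1, 3, 4) else 2   # a sample f from {1, ..., 9} to {1, 2}
k = largest_k(f, 9)
```

For a hyperlarge $n = \omega$ no such search is available, and the existence of $k$ rests on transfer rather than on enumeration.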

B) *finite or hyperfinite numbers.

A number $x = H(x_i)$ is called *finite or hyperfinite if all $x_i$ are finite. But there
is no other choice, as all classical numbers are finite. In other words each internal
number is hyperfinite. Yet an internal number might be larger than any natural
number; just consider $H(i)$, or $H(i - 2)$. Nevertheless they can be treated as
classical numbers: $H(i) - 1$, $(H(i))^2$, $H(i) - H(i - 2)$, etc. make sense.

C) *real or hyperreal.

A number $H(x_i)$ is called *real or hyperreal if all $x_i$ are real, that is to say if
$H(x_i) \in {}^*\mathbb{R}$. Obviously, such a number need not be real, but it can be treated as
a classical real number.

D) *continuity or hypercontinuity.

A function $H(f_i)$ from ${}^*\mathbb{R}$ to ${}^*\mathbb{R}$ is called *continuous or hypercontinuous at
$H(c_i) \in {}^*\mathbb{R}$ if for all $i$, $f_i$ is continuous at $c_i$.

E) *countable or hypercountable.

A set $H(S_i)$ is called *countable or hypercountable if all $S_i$ are countable. Hence
${}^*\mathbb{N}$ is hypercountable, although it is not countable, as will be shown below.

It should now be clear what is the purpose of this section, and that there are 
many variations of the theme. 

Exercise: Define hyperfinite and hyperinfinite sequences. 

Remark: In Nelson's internal set theory the prefix 'hyper' is not used: there 
'finite' means hyperfinite, 'countable' means hypercountable, etc. On the other 
hand 'standard finite' means finite.



2.10 *N, *Z, *Q, *R: main definitions and properties

The sets ${}^*\mathbb{N}$, ${}^*\mathbb{Z}$, ${}^*\mathbb{Q}$, and ${}^*\mathbb{R}$ have already been used in various examples, but
will now be treated in a more orderly fashion. Their definitions, the rules of
equality and identification, and the definitions of the arithmetic operations and
the inequalities are all obvious from the foregoing theory:

${}^*\mathbb{N} = \{H(n_i) : n_i \in \mathbb{N}\}$, and similarly for ${}^*\mathbb{Z}$, ${}^*\mathbb{Q}$ and ${}^*\mathbb{R}$,

$H(x_i) = H(y_i)$ if and only if $\{i : x_i = y_i\} \in U$,

$H(x_i) = x$ if and only if $\{i : x_i = x\} \in U$, so that $H(x) = {}^*x = x$,

$H(x_i) + H(y_i) = H(x_i + y_i)$, $x_i, y_i \in \mathbb{N}$,

$H(x_i) \cdot H(y_i) = H(x_i \cdot y_i)$, $x_i, y_i \in \mathbb{N}$,

$|H(x_i)| = H(|x_i|)$, $x_i \in \mathbb{Z}$,

$H(x_i) - H(y_i) = H(x_i - y_i)$, $x_i, y_i \in \mathbb{Z}$,

$1/H(x_i) = H(1/x_i)$, if all $x_i \neq 0$ and $x_i \in \mathbb{Q}$,

$H(x_i) < H(y_i) = H(x_i < y_i)$, $x_i, y_i \in \mathbb{N}$, and similarly for $>$, $\leq$, $\geq$.

Trivially, here $\mathbb{N}$ may be replaced by $\mathbb{Z}$, $\mathbb{Z}$ by $\mathbb{Q}$, and $\mathbb{Q}$ by $\mathbb{R}$. As far as inversion
is concerned, it is of course sufficient that $\{i : x_i \neq 0\} \in U$, as then the $x_i$ that
are 0 can be changed to, say, 1, without changing the value of $H(x_i)$. Note
that $x < y$ is a statement, so $x < y \in \mathbb{B}$, where $\mathbb{B} = \{\text{true}, \text{false}\}$, hence
$H(x_i < y_i) \in {}^*\mathbb{B} = \mathbb{B}$.

Definitions: Let $x$ be a hypernumber.

$x$ is called positive hyperlarge if $x > m$ for all $m \in \mathbb{N}$. Notation: $x \sim \infty$.
$x$ is called negative hyperlarge if $-x > m$ for all $m \in \mathbb{N}$. Notation: $x \sim -\infty$.
Instead of hyperlarge the term infinitely large may be used, but this does not
mean that $x$ would be equal to $\infty$, which is not regarded as a number at all.
$x$ is called finite or limited if it is not hyperlarge.

$x$ is called hypersmall, or is called an infinitesimal if $|x| < 1/m$ for all $m \in \mathbb{N}$.

Notation: $x \simeq 0$, or in case $x \neq 0$, $x \sim 0$.

x is called appreciable if x is limited but not hypersmall. 

Theorem 2.10.1 Infinitely large numbers and nonzero infinitesimals exist!

Proof: For example,

\[ H(1, 2, 3, \ldots) \sim \infty, \quad H(-1, -2, -3, \ldots) \sim -\infty, \]
\[ H(1, 1/2, 1/3, \ldots) \sim 0, \quad H(-1, -1/2, -1/3, \ldots) \sim 0. \]

□

Clearly $x = H(x_i) = H(+1, -1/2, +1/3, -1/4, \ldots) \sim 0$, but is it positive or
negative? This depends: if $\{i : x_i > 0\} \in U$ then $x > 0$ and otherwise $x < 0$.
Many dichotomies of this kind exist in nonstandard analysis, but they will not
cause any trouble, because it will not take very long before generating sequences
and the $H$-operator will disappear from the scene. Then references to the basic
free ultrafilter $U$ will no longer be required (except when it is necessary to go
back to basic principles).
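
The dichotomy can be made visible by listing the two competing index sets on a finite window. This Python sketch is illustrative only; the window size is an arbitrary assumption, and the sketch cannot tell which of the two full index sets lies in $U$.

```python
from fractions import Fraction


def x(i):
    """x_i = (-1)**(i + 1) / i, i.e. +1, -1/2, +1/3, -1/4, ..."""
    return Fraction((-1) ** (i + 1), i)


window = range(1, 13)
positive = {i for i in window if x(i) > 0}   # the odd indices
negative = {i for i in window if x(i) < 0}   # the even indices
```

The full sets of odd and of even indices are complementary, so exactly one of them belongs to the ultrafilter $U$, and that choice fixes the sign of $x$.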

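Theorem 2.10.2 $\varepsilon \sim 0$ if and only if $1/\varepsilon \sim +\infty$ or $1/\varepsilon \sim -\infty$, hence $s$ is
appreciable if and only if $1/s$ is appreciable.

Let $\varepsilon \sim 0$, $\varepsilon' \sim 0$, $s$ and $s'$ be appreciable, and $\omega \sim \infty$, $\omega' \sim \infty$. Then,

$\varepsilon + \varepsilon' \simeq 0$, $\varepsilon - \varepsilon' \simeq 0$, $\varepsilon \cdot \varepsilon' \sim 0$,

$\varepsilon + s$ and $\varepsilon - s$ are appreciable, and $\varepsilon \cdot s \sim 0$,

$\varepsilon + \omega \sim \infty$, $\varepsilon - \omega \sim -\infty$, and $\varepsilon \cdot \omega \sim +\infty$ or $-\infty$ or $0$, or $\varepsilon \cdot \omega$ is
appreciable,

$s + s'$ and $s - s'$ are appreciable or $\simeq 0$, and $s \cdot s'$ is appreciable,
$s + \omega \sim +\infty$, $s - \omega \sim -\infty$, $s \cdot \omega \sim +\infty$ or $-\infty$,

$\omega + \omega' \sim +\infty$, $\omega - \omega' \sim +\infty$ or $-\infty$ or $\simeq 0$, or $\omega - \omega'$ is appreciable, and
$\omega \cdot \omega' \sim +\infty$.

Exercise: Show this, and provide examples for all possibilities in case there are
more than one. □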
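
For the product of a nonzero infinitesimal and a positive hyperlarge number, three of the listed possibilities can be exhibited with simple generating sequences. The Python sketch below uses sequences chosen for illustration (they are assumptions, not examples from the text) and prints nothing; it only tabulates a few terms.

```python
from fractions import Fraction

# Generating sequences for eps * omega with eps ~ 0 (nonzero) and omega ~ +infinity.
products = {
    "infinitesimal": lambda i: Fraction(1, i * i) * Fraction(i),   # eps_i = 1/i^2, omega_i = i
    "appreciable":   lambda i: Fraction(1, i) * Fraction(i),       # eps_i = 1/i,   omega_i = i
    "hyperlarge":    lambda i: Fraction(1, i) * Fraction(i * i),   # eps_i = 1/i,   omega_i = i^2
}

# A few terms of each product sequence, at the indices 1, 10 and 100.
sample = {name: [seq(i) for i in (1, 10, 100)] for name, seq in products.items()}
```

The three product sequences are $1/i$, $1$ and $i$, generating a nonzero infinitesimal, the appreciable number 1 and a positive hyperlarge number, respectively.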

The results of this section so far show that at long last Leibniz' theory of hy- 
persmall and hyperlarge numbers can be given a sound mathematical basis. It 
was Robinson who in 1961 for the first time formulated a complete theory of 
nonstandard analysis. See Sections 1.8 and 1.9 for more details. 

In Section 2.6 it was shown that $\mathbb{N}$ is an external subset of ${}^*\mathbb{N}$. An alternative
proof can be given by showing that $\mathbb{N}$ has a property it would not have were
it internal. This happens to be a proof technique that can be applied to many
external notions. Some even define external sets as sets that have such a property,
but they fail to tell what that property is given that set, so that this definition
would not seem to be very practical. In the case of $\mathbb{N}$ the property is that a
bounded internal subset $S$ of ${}^*\mathbb{N}$ has a maximum. The proof of this statement
is not difficult: let $S = H(S_i)$ be bounded by $b = H(b_i)$, then $S_i$ is bounded by
$b_i$, at least for all $i \in \{i : S_i \text{ is bounded by } b_i\}$ which is an element of the free
ultrafilter $U$, but as usual we may assume that this is true for all $i$ (why?). Hence
each $S_i$ has a maximum $m_i$ and $H(m_i)$ is the maximum of $S$. Now $\mathbb{N}$ is bounded
in ${}^*\mathbb{N}$ by any hyperlarge natural number, and if it were internal it would have a
maximum, which it has not. Therefore, $\mathbb{N}$ is external. □
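
The construction of the maximum in this proof is componentwise and can be sketched in Python. The particular sets $S_i$ and bounds $b_i$ below are illustrative assumptions.

```python
def S(i):
    """S_i = the even numbers in {1, ..., i + 1} (an illustrative choice)."""
    return {n for n in range(1, i + 2) if n % 2 == 0}


def b(i):
    """b_i = i + 1 bounds S_i, so b = H(b_i) bounds S = H(S_i)."""
    return i + 1


def m(i):
    """m_i = max S_i, so that H(m_i) is the maximum of S."""
    return max(S(i))
```

Each $m_i$ lies in $S_i$ and does not exceed $b_i$, which is the index-by-index content of the claim that $H(m_i)$ is the maximum of $S$.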

In a similar way it can be shown that the set of all infinitesimals is external. Since
this set is bounded in ${}^*\mathbb{R}$, say by 1, it must have a least upper bound $\beta$ if it were
internal. But this would imply that $\beta$ itself would be an infinitesimal, so that $2\beta$
would be an infinitesimal as well, but $2\beta > \beta$.

Exercise: Show that $\beta$ would indeed be an infinitesimal itself.

Theorem 2.10.3 In ${}^*\mathbb{R}$ the set of all infinitesimals is external. □

Another variation of the theme is the next result.

Theorem 2.10.4 In ${}^*\mathbb{R}$ the set of all positive hyperlarge numbers is external.
Proof: Left as an exercise. Hint: use lower bounds. □



2.11 Overflow and underflow 

This section is concerned with the existence in certain internal sets of an element 
that depending on the internal set given either is infinitely large, or is limited, or 
is an infinitesimal, or is not an infinitesimal. 

Theorem 2.11.1 (Overflow or overspill.)

Let $S$ be an internal subset of ${}^*T$, where $T$ is either $\mathbb{N}$ or $\mathbb{Z}$, or $\mathbb{Q}$, or $\mathbb{R}$, such that
$\forall m \in \mathbb{N} : \exists s(m) \in S : s(m) > m$, i.e. such that from a classical point of view
$S$ contains arbitrarily large elements, then $\exists s \in S : s \sim \infty$, i.e. $S$ contains some
infinitely large element.

Proof: If $\forall b \in {}^*T : \exists s(b) \in S : s(b) > b$, then take $b \sim \infty$, which implies that
$s(b) \sim \infty$. If this is not true, then $\exists b \in {}^*T : \forall s \in S : s \leq b$, so that by Łoś'
theorem, $H(\exists b_i \in T : \forall s_i \in S_i : s_i \leq b_i)$, where $S = H(S_i)$ and $b = H(b_i)$, which
by the classical least upper bound theorem implies that,

\[ H(\exists \beta_i \in T : [\forall s_i \in S_i : s_i \leq \beta_i] \wedge [\exists s'_i \in S_i : s'_i > \beta_i - 1]), \]

hence, again by Łoś' theorem,

\[ \exists \beta \in {}^*T : [\forall s \in S : s \leq \beta] \wedge [\exists s' \in S : s' > \beta - 1], \]

and it follows that $\forall m \in \mathbb{N} : \beta \geq s(m) > m$, hence that $\beta \sim \infty$, so that $s' \sim \infty$.

□

Theorem 2.11.2 (Underflow or underspill.) 

Let $S$ be an internal subset of ${}^*T$, with $T$ as before, such that
$\forall \omega \in {}^*\mathbb{N}, \omega \simeq \infty : \exists s(\omega) \in S : s(\omega) < \omega \wedge s(\omega) \simeq \infty$, i.e. such that $S$ contains infinitely large
elements that are arbitrarily small, then $\exists s \in S : s$ is limited.

Proof: Let $S_1 = \{s \in S : s \ge 1\}$. Clearly, $S_1$ is not empty, so that by the classical
greatest lower bound theorem,

$$\exists \beta \in {}^*\mathbb{N} : [\forall s \in S_1 : s \ge \beta] \wedge [\exists s' \in S_1 : s' < \beta + 1],$$

as can be shown by an argument similar to that used in the preceding proof. It
follows that $\beta$ is limited, as otherwise $\beta \le s(\beta) < \beta$, so that $s'$ is limited as well.

□ 

Since in *Q and *R, x is infinitely large if and only if 1/x is a nonzero infinitesimal, 
these two theorems have the following counterparts. 

Theorem 2.11.3 ('Inverse' overflow.) 

Let $S$ be an internal subset of ${}^*\mathbb{Q}$ or ${}^*\mathbb{R}$, such that $\forall m \in \mathbb{N} : \exists s(m) \in S :$
$|s(m)| < 1/m$, i.e. such that from a classical point of view $S$ contains arbitrarily
small elements, then $\exists s \in S : s \simeq 0$.

Proof: It is no restriction to assume that $0 \notin S$. Apply overflow to $S' = \{t : 1/t \in S\}$
and use the fact that $S'$ is internal if (and only if) $S$ is internal. □

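Theorem 2.11.4 ('Inverse' underflow.)

Let $S$ be an internal subset of ${}^*\mathbb{Q}$ or ${}^*\mathbb{R}$, such that $\forall \varepsilon, \varepsilon \simeq 0, \varepsilon > 0 : \exists s \in S : s > \varepsilon$,
then $\exists s \in S : s$ is not an infinitesimal.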

Proof: Similar to the preceding proof. □ 

The overflow theorem immediately implies that $S$ is an external subset of ${}^*S$ in
case $S$ is equal to $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$ or $\mathbb{R}$, a fact we knew already. The underflow theorem
immediately implies that ${}^*\mathbb{N} \setminus \mathbb{N}$ too is an external subset of ${}^*\mathbb{N}$, and similarly
for the other sets of numbers. This also follows from the fact that a subset of
a standard set ${}^*S$ is internal if and only if its complement with respect to ${}^*S$
is internal. The 'inverse' underflow theorem implies that the subset of ${}^*\mathbb{R}$ of all
hypersmall elements of ${}^*\mathbb{R}$ is external, and similarly for ${}^*\mathbb{Q}$.



2.12 *N and *Z: more properties 

Theorem 2.12.1 If $n \in {}^*\mathbb{N}$, then either $n \in \mathbb{N}$ or $n$ is hyperlarge.

Proof: If $n = H(n_i)$ is not hyperlarge, then $n \le m$ for some $m \in \mathbb{N}$, so that
$\{i : 0 < n_i \le m\} \in U$, but then $\{i : n_i = m'\} \in U$ for precisely one $m' \in \mathbb{N}$,
$m' \le m$, as follows from the properties of $U$, so that $n = m'$. □

Theorem 2.12.2 Given any $\omega \in {}^*\mathbb{N}$, $\omega \simeq \infty$, then $S = [1, \omega]$ is uncountable.

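Proof: The proof is given by constructing a bijection between $S$ and the set of all
infinite sequences of 0's and 1's, which is known to be uncountable.

Given $\omega_1 = \omega$, the interval $S$ is split into intervals $S_0 = (0, \omega_0]$ and $S_1 = (\omega_0, \omega_1]$
of approximately equal length (see the details below), $S_0$ is in a similar way
split into $S_{00} = (0, \omega_{00}]$ and $S_{01} = (\omega_{00}, \omega_{01}]$, and $S_1$ into $S_{10} = (\omega_{01}, \omega_{10}]$ and
$S_{11} = (\omega_{10}, \omega_{11}]$, etc., where at each split the limits of all subintervals involved
are given by a column of:

$$\begin{array}{lll}
\omega_1 = & \omega_{11} = & \omega_{111} = \dots \\
 & & \omega_{110} = \dots \\
 & \omega_{10} = & \omega_{101} = \dots \\
 & & \omega_{100} = \dots \\
\omega_0 = & \omega_{01} = & \omega_{011} = \dots \\
 & & \omega_{010} = \dots \\
 & \omega_{00} = & \omega_{001} = \dots \\
 & & \omega_{000} = \dots \\
0 = & 0 = & 0 = \dots
\end{array}$$

where $\omega_{0 \dots 00} = \lfloor (\omega_{0 \dots 01} + 0)/2 \rfloor$, and, given that $b$ is any string of 0's and 1's with
an even positive binary value and that $c$ is any such string with an odd binary
value, $\omega_b = \lfloor (\omega_{b-1} + \omega_{b+1})/2 \rfloor$, $\omega_c = \omega_{(c-1)/2}$, so that, for example,

$$\begin{array}{l}
\omega_{0111} = \omega_{011} = \omega_{01} = \omega_0 = \lfloor (\omega_1 + 0)/2 \rfloor, \\
\omega_{1011} = \omega_{101} = \omega_{10} = \lfloor (\omega_{01} + \omega_{11})/2 \rfloor, \\
\omega_{0001} = \omega_{000} = \lfloor (\omega_{001} + 0)/2 \rfloor, \\
\omega_{1001} = \omega_{100} = \lfloor (\omega_{011} + \omega_{101})/2 \rfloor, \\
\omega_{0011} = \omega_{001} = \omega_{00} = \lfloor (\omega_{01} + 0)/2 \rfloor, \\
\omega_{1111} = \omega_{111} = \omega_{11} = \omega_1 = \omega, \\
\omega_{0101} = \omega_{010} = \lfloor (\omega_{001} + \omega_{011})/2 \rfloor, \\
\omega_{1101} = \omega_{110} = \lfloor (\omega_{101} + \omega_{111})/2 \rfloor.
\end{array}$$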



Each of these intervals contains a hyperinfinite number of elements. For example,
if $\omega = H(1, 2, 3, 4, 5, 6, 7, 8, \dots)$, then

$S = H((0,1], (0,2], (0,3], (0,4], (0,5], (0,6], (0,7], (0,8], \dots)$,

$S_0 = H(\emptyset, (0,1], (0,1], (0,2], (0,2], (0,3], (0,3], (0,4], \dots)$,

$S_1 = H((0,1], (1,2], (1,3], (2,4], (2,5], (3,6], (3,7], (4,8], \dots)$,

$S_{00} = H(\emptyset, \emptyset, \emptyset, (0,1], (0,1], (0,1], (0,1], (0,2], \dots)$,

$S_{01} = H(\emptyset, (0,1], (0,1], (1,2], (1,2], (1,3], (1,3], (2,4], \dots)$,

$S_{10} = H(\emptyset, \emptyset, (1,2], (2,3], (2,3], (3,4], (3,5], (4,6], \dots)$,

$S_{11} = H((0,1], (1,2], (2,3], (3,4], (3,5], (4,6], (5,7], (6,8], \dots)$, etc.

Let $\omega = H(n_i)$. Given any $x = H(x_i) \in S$, it may be assumed that $0 < x_i \le n_i$
for all $i$. Either $x \in S_0$ or $x \in S_1$; if $x \in S_0$, then either $x \in S_{00}$ or $x \in S_{01}$, and
if $x \in S_1$, then either $x \in S_{10}$ or $x \in S_{11}$, etc., which defines a unique infinite
sequence of 0's and 1's. If, for example, $x \in S_0$, $x \in S_{01}$, $x \in S_{011}$, ..., then the
sequence is $(0, 1, 1, \dots)$.

Conversely, any infinite sequence of 0's and 1's defines a sequence of intervals for
some $x = H(x_i)$, inducing for each $i$ a sequence of intervals for $x_i$. Continuing
the example, let $i = 7$, then, if the infinite sequence of 0's and 1's is $(0, 1, 1, \dots)$,
the sequence of intervals for $x_7$ is,

$$(0,3], (1,3], (2,3], \dots,$$

because $(0,3]$ is the 7th term of the sequence generating $S_0$, $(1,3]$ that of the
sequence generating $S_{01}$, $(2,3]$ that of the sequence generating $S_{011}$, etc., and
subsequent terms are either $(2,3]$ or $\emptyset$, and an $\emptyset$ is followed by $\emptyset$'s only. Now no
matter how large $n_i$ is, the interval sequence for $x_i$ will certainly contain a term
of length 1, as, the length of $(a, b]$ being $b - a$, at each split an interval is split
into subintervals of approximately equal length. In fact it will take no more than
$\lceil \log_2 n_i \rceil$ splits to find an interval of length 1. For each $i$ with $n_i \ge 2$ let $x_i$ be
the upper limit of any term of the interval sequence for $x_i$ whose length is 1 (all
those terms have the same upper limit), and for each $i$ with $n_i = 1$ let $x_i = 1$.
Then $x \in S$. In the example, $x_7 = 3$. □

According to the construction in this proof, if as before $\omega = H(i)$, then,

$(1, 1, 1, \dots)$ leads to $x = \omega$,

$(0, 0, 0, \dots)$ to $x = 1$,

$(0, 1, 1, \dots)$ to $x = H(1, 1, 1, 2, 2, 3, 3, \dots)$,

$(1, 0, 0, \dots)$ to $x = H(1, 2, 2, 3, 3, 4, 4, \dots)$, etc.

Note that the last two $x$'s differ by $1 = H(0, 1, 1, \dots)$, and that the last sequence
of 0's and 1's can be found from the one but last by 'adding 1 at infinity'.
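
The construction can be checked one coordinate at a time. The following Python sketch is only an illustration and not part of the original text: the names `omega`, `interval` and `point` are hypothetical, strings of binary digits stand for the subscripts of the endpoints, and only finitely many coordinates $i$ are computed, with $n_i = i$ as in the example.

```python
from itertools import chain, repeat


def omega(bits: str, n: int) -> int:
    """Endpoint omega_bits in the coordinate whose generating term is n."""
    if bits == "" or bits.endswith("1"):
        # omega_c = omega_{(c-1)/2}: drop trailing 1's; nothing left means omega itself.
        stripped = bits.rstrip("1")
        return n if stripped == "" else omega(stripped, n)
    value, width = int(bits, 2), len(bits)
    # omega_b = floor((omega_{b-1} + omega_{b+1}) / 2), with 0 below the all-zero string.
    lower = 0 if value == 0 else omega(format(value - 1, f"0{width}b"), n)
    upper = omega(format(value + 1, f"0{width}b"), n)
    return (lower + upper) // 2


def interval(bits: str, n: int) -> tuple[int, int]:
    """The pair (lo, hi) describing the half-open interval S_bits = (lo, hi]."""
    value, width = int(bits, 2), len(bits)
    lo = 0 if value == 0 else omega(format(value - 1, f"0{width}b"), n)
    return lo, omega(bits, n)


def point(sequence, n: int) -> int:
    """x_i for an infinite 0/1 sequence: upper limit of the first interval of length 1."""
    if n == 1:
        return 1
    prefix = ""
    for bit in sequence:
        prefix += str(bit)
        lo, hi = interval(prefix, n)
        if hi - lo == 1:
            return hi


x_011 = [point(chain([0], repeat(1)), i) for i in range(1, 8)]
x_100 = [point(chain([1], repeat(0)), i) for i in range(1, 8)]
```

For $i = 1, \dots, 7$ the two lists agree with the first seven generating terms given above for the sequences $(0, 1, 1, \dots)$ and $(1, 0, 0, \dots)$.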

From the construction it also follows that if the generating sequence $(n_i)$ for $\omega$
is nondecreasing (as it is in the example given), then the generating sequence
$(x_i)$ for $x$ too is nondecreasing. Now let $x$ be any hyperlarge element of ${}^*\mathbb{N}$,
then there exists an $\omega = H(n_i) \ge x$ such that $(n_i)$ is nondecreasing, for simply
let $n_i = \max\{x_j : j \le i\}$. Starting from this $\omega$ and computing the $x_i$ from
the infinite sequence of 0's and 1's that corresponds to $x$, it follows that $(x_i)$ is
nondecreasing as well. This shows the following corollary, that probably does not
have much practical value.

Corollary 2.12.1 If $x \in {}^*\mathbb{N}$ and $x \simeq \infty$, then there exists a nondecreasing
infinite sequence $(x_i)$ tending to infinity such that $x = H(x_i)$. □

Example: Let $x = H(y_i)$, where $y_i = 1$ if $i = 2j + 1$ for some $j$, $y_i = 2$ if
$i = 4j + 2$ for some $j$, $y_i = 3$ if $i = 8j + 4$ for some $j$, etc. Hence for each $n$
the number of $y_i = n$ is infinitely large, and $(y_i)$ is certainly not nondecreasing,
and this sequence has no limit. Assume that $x \simeq \infty$, which is possible, as e.g.
$\{1, 2, 4, 8, 16, \dots\}$ could be an element of the filter $U$. Note that $y_1 = 1$, $y_2 = 2$,
$y_4 = 3$, etc. Then,

$$\omega = H(1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 5, \dots).$$
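
The generating terms of $\omega$ in this example are the running maxima $n_i = \max\{y_j : j \le i\}$. As a hedged illustration (the helper name `y` and the cut-off at 16 terms are choices made here, not taken from the text), they can be computed in Python from the definition $y_i = k + 1$, where $2^k$ is the largest power of two dividing $i$:

```python
from itertools import accumulate


def y(i: int) -> int:
    """y_i = k + 1, where 2**k is the largest power of two dividing i."""
    k = 0
    while i % 2 == 0:
        i //= 2
        k += 1
    return k + 1


ys = [y(i) for i in range(1, 17)]
# n_i = max{y_j : j <= i}: a nondecreasing sequence dominating (y_i).
running_max = list(accumulate(ys, max))
```

The list `ys` oscillates, while `running_max` is nondecreasing and dominates it term by term, which is all the argument above requires of $\omega$.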

To $x$ there corresponds an infinite sequence of 0's and 1's. Which one is difficult
to tell, because this entirely depends on the underlying filter $U$. To this
sequence there corresponds the desired nondecreasing sequence $(x_i)$, and again it
is difficult to tell which one. Even though the desired $(x_i)$ cannot be determined
constructively, the following result has been shown.



Corollary 2.12.2 *N is already generated by all nondecreasing infinite sequences 
that tend to infinity. □

Subtracting well chosen $H(x_i)$ and $H(y_i)$ from each other will destroy this nice
property, however, see Section 1.10.

Another corollary of the theorem is, that since $\mathbb{N}$ is countable, it follows that if
$\omega \simeq \infty$, then $[1, \omega] - \mathbb{N}$ is uncountable, and that ${}^*\mathbb{N}$ and ${}^*\mathbb{N} - \mathbb{N}$ are uncountable
as well. A direct proof of this is as follows.

Proof: The proof uses a variation of Cantor's diagonal method. If ${}^*\mathbb{N}$ were countable,
let $s(n) = H(s_i(n)) = H(s_1(n), s_2(n), \dots)$ be its $n$-th element, $n \in \mathbb{N}$.
Let,

$$t_1 = 1 + s_1(1), \quad t_2 = 1 + \max(s_2(1), s_2(2)), \quad t_3 = 1 + \max(s_3(1), s_3(2), s_3(3)),$$

etc. Then $t = H(t_1, t_2, t_3, \dots)$ is larger than each $s(n)$, including itself, as $t = s(j)$
for some $j$, contradiction. □
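
The key step is that $t_i > s_i(n)$ for every $i \ge n$, that is, for all but finitely many $i$. A finite analogue can be tried out in Python; this is only a sketch, with the function name `diagonal` and the three sample sequences chosen here for illustration:

```python
def diagonal(sequences, length):
    """t_i = 1 + max(s_i(1), ..., s_i(i)) for i = 1, ..., length.

    sequences[n - 1] is a function i -> s_i(n). Only finitely many sequences
    are available here, so the maximum runs over n <= min(i, len(sequences)).
    """
    return [
        1 + max(s(i) for s in sequences[:i])
        for i in range(1, length + 1)
    ]


# Three sample generating sequences, standing in for s(1), s(2), s(3).
sequences = [lambda i: i, lambda i: i * i, lambda i: 2 ** i]
t = diagonal(sequences, 10)
```

For each of the three sample sequences, `t` exceeds the $n$-th one at every index $i \ge n$.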

As regards externality, recall from Section 2.6 that $\mathbb{N}$ is an external set, and so
is ${}^*\mathbb{N} - \mathbb{N}$, as these two sets are each other's complement with respect to ${}^*\mathbb{N}$, see
Section 2.5. Seemingly friendly functions from ${}^*\mathbb{N}$ to ${}^*\mathbb{N}$ turn out to be external
as well. For example, let $f$ be a function from ${}^*\mathbb{N}$ to ${}^*\mathbb{N}$ such that $f(n) = 1$ if
$n \in \mathbb{N}$ and $f(n) \ne 1$ if $n \notin \mathbb{N}$. Then $f$ is external. For if $f$ were internal, then
by the internal definition principle the set $T = \{x \in {}^*\mathbb{N} : f(x) = 1\}$ would be
internal, but $T = \mathbb{N}$. After all, $f$ is not so friendly, as the external $\mathbb{N}$ is involved
in its definition.

Exercise: Show that $f$ is external if $f(n) = 1$ if $n \simeq \infty$, and $f(n) \ne 1$ if not.

Exercise: Let $f$ be a function from ${}^*\mathbb{N}$ to $\{1, 2\}$ such that $f(n) = 1$ if and only if
$n = 1$, 2, or 3. Show that $f$ is a standard function.

Exercise: Let $f$ be an internal function from ${}^*\mathbb{N}$ to itself, such that $f(n) = 1$ if
$n \in \mathbb{N}$. Show that $f(n) = 1$ for all $n \in {}^*\mathbb{N}$ and that $f$ is standard.

Theorem 2.12.3 Let $f$ be an internal function from ${}^*\mathbb{N}$ to $\{1, 2\}$, such that both
1 and 2 are assumed somewhere, i.e. such that $f$ is onto. Moreover, let $f(n) = 2$
for all $n > b$ for some $b \in {}^*\mathbb{N}$. Then there is a $\beta \le b$, $\beta \in {}^*\mathbb{N}$ such that $f(\beta) = 1$
and $f(n) = 2$ if $n > \beta$. Hence there is a last $\beta$ such that $f(\beta) = 1$, even though
$b \simeq \infty$ is allowed, and $\{n : n \le b\}$ is uncountable in that case.

Proof: The set $\{x \in {}^*\mathbb{N} : f(x) = 1\}$ is internal and is bounded above by $b$, hence
by the least upper bound theorem in its internal form it has a least upper bound
$\beta$, which must be a maximum. □

The results of this section have obvious counterparts for ${}^*\mathbb{Z}$.



2.13 *Q and *R: more properties; standard part

Theorem 2.13.1 (Standard part theorem.)

If $x \in {}^*\mathbb{R}$ then either $|x| \simeq \infty$, or $x = r + \varepsilon$, $r \in \mathbb{R}$, $\varepsilon \simeq 0$ for unique $r$ and $\varepsilon$.

Proof: Let $x$ not be hyperlarge. Then the uniqueness of $r$ and $\varepsilon$ is easily shown,
for let $r + \varepsilon = r' + \varepsilon'$, $r, r' \in \mathbb{R}$, $\varepsilon, \varepsilon' \simeq 0$, then $r - r' \simeq 0$, but $r - r' \in \mathbb{R}$,
hence $r - r' = 0$, so that $r = r'$ and $\varepsilon = \varepsilon'$. Since $x$ is limited, it follows that for
some $b \in \mathbb{N}$, $|x| < b$. Hence $S = \{s : s \in \mathbb{R}, s < x\}$ is a nonempty subset of
$\mathbb{R}$ that is bounded above by $b$. Indeed, $S$ is nonempty because $-b \in S$. By the
least upper bound theorem $S$ has a least upper bound $\beta \in \mathbb{R}$. If $\beta < x - 1/m$
for some $m \in \mathbb{N}$, then $\beta$ would not be an upper bound of $S$. If $\beta > x + 1/m$ for
some $m \in \mathbb{N}$, then $\beta$ would not be the least upper bound of $S$. Therefore, for all
$m \in \mathbb{N}$, $|\beta - x| \le 1/m$, i.e. $\beta - x \simeq 0$, so that $r = \beta$ is the desired real. □

Definition: If x is a limited hyperreal, then the (unique) standard hyperreal r 
that is infinitely close to x is called the standard part of x, which is denoted by 
st(x). 

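The theorem is false if ${}^*\mathbb{R}$ is replaced by ${}^*\mathbb{Q}$ (with $r$ required to lie in $\mathbb{Q}$).

Counterexample: Let $(x_i)$ be a Cauchy sequence of rationals, such that $R(x_i) = \sqrt{2}$.
Then $x = H(x_i) \in {}^*\mathbb{Q}$ and $x$ is limited. Suppose $x = r + \varepsilon$, $r \in \mathbb{Q}$, $\varepsilon \simeq 0$.
Since also $x \in {}^*\mathbb{R}$, it follows from the theorem that $x \simeq \sqrt{2}$ and $r = \sqrt{2}$ would
be the only possibility, a contradiction. □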

It follows that not every limited hyperrational has a standard part in Q. Yet 
limited hyperrationals do have a standard part, in R and in fact the entire R can 
thus be obtained. 

Theorem 2.13.2 Given any $r \in \mathbb{R}$, there exists an $x \in {}^*\mathbb{Q}$ such that $\mathrm{st}(x) = r$.

Proof: If $r \in \mathbb{R}$, then there exists a sequence $(r_n)$, $r_n \in \mathbb{Q}$, such that $r_n$ tends to
$r$ if $n$ tends to infinity, hence for all $m \in \mathbb{N}$, $|r_n - r| < 1/m$ if $n$ is large enough.
This implies that,

$$\forall m \in \mathbb{N} : \{i : |r_i - r| < 1/m\} \in U,$$

hence that $x = H(r_i) \simeq r$. □

Hence in some very special way ${}^*\mathbb{Q}$ completely defines $\mathbb{R}$.

Theorem 2.13.3 Let $a, b \in \mathbb{R}$, $a < b$. If $x \in {}^*[a, b]$, then $\mathrm{st}(x) \in [a, b]$, but if the
interval in $\mathbb{R}$ is not closed, this is sometimes not true.

Proof: Since $x$ is limited, $\mathrm{st}(x)$ is well defined. If $\mathrm{st}(x) \notin [a, b]$, then either $\mathrm{st}(x) = a - \delta$
or $\mathrm{st}(x) = b + \delta$ for some $\delta \in \mathbb{R}$, $\delta > 0$. But $x - \mathrm{st}(x) \simeq 0$, hence either
$x < a - \delta/2$ or $x > b + \delta/2$, but then $x$ is not in ${}^*[a, b]$.

If, for example, $x \in {}^*(a, b]$, let $a_i$ in $\mathbb{R}$ converge to $a$ such that $a < a_i \le b$, then
$x = H(a_i) \in {}^*(a, b]$ but $\mathrm{st}(x) = a \notin (a, b]$. □



2.14 An alternative to introducing *Z, *Q, and *R 

In Sections 2.1 and 2.10 the following scheme for the introduction of *N, *Z,
*Q, and *R was used,

    N   ->  Z   ->  Q   ->  R
    |       |       |       |
    v       v       v       v
    *N      *Z      *Q      *R


but there is an alternative, namely, 



    N       Z       Q       R
    |       |       |       |
    v       |       |       |
    *N  ->  *Z  ->  *Q  ->  *R



Extending ${}^*\mathbb{N}$ to ${}^*\mathbb{Z}$ directly. Consider pairs $(m, n)$ of elements $m$ and $n$ of ${}^*\mathbb{N}$,
and let these pairs generate constants $Z'(m, n)$, subject to exactly the same
identification and equality rules as were given in Section 2.1 for $Z(m, n)$, $m, n \in \mathbb{N}$.
Let $Z' = \{Z'(m, n) : m, n \in {}^*\mathbb{N}\}$. In $Z'$ (non)negativity, (non)positivity, absolute
value, addition, subtraction, multiplication and the inequalities are defined in
exactly the same way as they were defined for $\mathbb{Z}$.

A 'natural' bijection between $Z'$ and ${}^*\mathbb{Z}$ can now be established as follows. Given
any $z' \in Z'$, there are infinite sequences $(m_i)$ and $(n_i)$, $m_i, n_i \in \mathbb{N}$, such that

$$z' = Z'(H(m_i), H(n_i)).$$

Each pair $(m_i, n_i)$ defines the element $Z(m_i, n_i) \in \mathbb{Z}$, hence the two sequences
define the element $z = H(Z(m_i, n_i)) \in {}^*\mathbb{Z}$. So each $z' \in Z'$ defines a $z \in {}^*\mathbb{Z}$.
Conversely, each $z \in {}^*\mathbb{Z}$ defines a $z' \in Z'$. It is not difficult to see that the bijection
that is implied by this preserves (non)negativity, (non)positivity, absolute
value, addition, subtraction, multiplication and the inequalities, so that $Z'$ may
be identified with ${}^*\mathbb{Z}$ by identifying corresponding $z'$ and $z$.

Extending ${}^*\mathbb{Z}$ to ${}^*\mathbb{Q}$ directly. This time consider pairs $(m, n)$ of elements $m$ and
$n$ of ${}^*\mathbb{Z}$, and let these pairs generate constants $Q'(m, n)$, subject to exactly the
same identification and equality rules as were given in Section 2.1 for $Q(m, n)$,
$m, n \in \mathbb{Z}$. Let $Q' = \{Q'(m, n) : m, n \in {}^*\mathbb{Z}\}$. For the elements of $Q'$ everything
is defined in exactly the same way as it was done for the elements of $\mathbb{Q}$. Now a
bijection between $Q'$ and ${}^*\mathbb{Q}$ can be established that preserves everything, so that
$Q'$ may be identified with ${}^*\mathbb{Q}$.

In the preceding section we have shown that given any $r \in \mathbb{R}$ there exists an
$x \in {}^*\mathbb{Q}$ such that $\mathrm{st}(x) = r$. An alternative proof is now as follows. Given any
$r \in \mathbb{R}$, $\forall n \in \mathbb{Z} : \exists m \in \mathbb{Z} : m \le nr < m + 1$, hence, by transfer,

$$\forall n \in {}^*\mathbb{Z} : \exists m \in {}^*\mathbb{Z} : m \le n \cdot {}^*r < m + 1,$$

but ${}^*r = r$, hence for positive $n$, $m/n \le r < m/n + 1/n$. Take $n \simeq \infty$, then, as $1/n \simeq 0$,
$\mathrm{st}(m/n) = r$, where $x = m/n \in Q'$, as $m, n \in {}^*\mathbb{Z}$.
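
The classical statement used here, $m \le nr < m + 1$ with $m = \lfloor nr \rfloor$, can be tried out for finite $n$. The Python sketch below is an illustration under stated assumptions: the name `rational_below` is hypothetical, $r = \sqrt{2}$ is represented by a floating-point number, and $n$ runs through a few powers of ten rather than a hyperlarge value.

```python
from fractions import Fraction
from math import floor, sqrt


def rational_below(r: float, n: int) -> Fraction:
    """Return m/n with m = floor(n*r), so that m/n <= r < m/n + 1/n."""
    return Fraction(floor(n * r), n)


r = sqrt(2)
# The error r - m/n is below 1/n, so it shrinks as n grows.
approximations = [rational_below(r, 10 ** k) for k in range(1, 7)]
```

For a hyperlarge $n$ the bound $1/n$ becomes infinitesimal, which is exactly why $\mathrm{st}(m/n) = r$.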

Extending ${}^*\mathbb{Q}$ to ${}^*\mathbb{R}$ directly. This direct extension is more involved than the
preceding two because in order to generate $\mathbb{R}$ from $\mathbb{Q}$ infinite sequences (i.e. Cauchy
sequences) of rationals are required, whereas for the preceding two extensions
only pairs of natural numbers or integers were required, for recall that the
internal version of a pair is still a pair, but that the internal version of an infinite
sequence is not an infinite sequence. What is needed are internal Cauchy
sequences with terms in ${}^*\mathbb{Q}$. Now what is an internal Cauchy sequence in the first
place? A classical sequence $(r(n))$ of rationals $r(n)$ is a Cauchy sequence if,

$$\forall m \in \mathbb{N} : \exists k \in \mathbb{N} : \forall n, p \in \mathbb{N}, n, p > k : |r(n) - r(p)| < 1/m.$$

Consequently, an internal Cauchy sequence $(r(n))$, $n \in {}^*\mathbb{N}$, $r(n) \in {}^*\mathbb{Q}$, is
characterized by classical Cauchy sequences $(r_i(n))$, $n \in \mathbb{N}$, $r_i(n) \in \mathbb{Q}$, $i = 1, 2, 3, \dots$,
such that $r(H(n_i)) = H(r_i(n_i))$, hence by the hyperstatement,

$$H[\forall m_i \in \mathbb{N} : \exists k_i \in \mathbb{N} : \forall n_i, p_i \in \mathbb{N}, n_i, p_i > k_i : |r_i(n_i) - r_i(p_i)| < 1/m_i],$$

which can be simplified to,

$$H[\forall m \in \mathbb{N} : \exists k \in \mathbb{N} : \forall n, p \in \mathbb{N}, n, p > k : |r_i(n) - r_i(p)| < 1/m].$$

In an analogous way, internal concurrency between the internal Cauchy sequences
$r(n)$ and $s(n)$ is characterized by the hyperstatement,

$$H[\forall m \in \mathbb{N} : \exists k \in \mathbb{N} : \forall n \in \mathbb{N}, n > k : |r_i(n) - s_i(n)| < 1/m].$$

Now let each internal Cauchy sequence $r(n)$ of hyperrationals generate a constant
$R'(r(n))$ and let,

$$R' = \{R'(r(n)) : n \in {}^*\mathbb{N}, r(n) \in {}^*\mathbb{Q}, \text{ and } (r(n)) \text{ is an internal Cauchy sequence}\}.$$

Equality. $R'(r(n)) = R'(s(n))$ if and only if $r(n)$ and $s(n)$ are internally concurrent.

Identification. $R'(r(n)) = r$ if for all $n \in \mathbb{N}$ (hence for all $n \in {}^*\mathbb{N}$), $r(n) = r$ for
some $r \in {}^*\mathbb{Q}$.

The definitions of absolute value, addition, subtraction, multiplication, division
and the inequalities are similar to those given before in Section 2.1 and all this is
preserved by the bijection between $R'$ and ${}^*\mathbb{R}$ that is defined as follows. Let $x' \in R'$,
then $x' = R'(r(n))$ is generated by the internal Cauchy sequence $r(H(n_i)) = H(r_i(n_i))$,
which for each $i$ defines the classical Cauchy sequence $(r_i(n))$, $n \in \mathbb{N}$,
which in turn defines $x_i = R(r_i(n)) \in \mathbb{R}$, and hence $x = H(x_i) \in {}^*\mathbb{R}$. Conversely,
each $x \in {}^*\mathbb{R}$ defines an $x' \in R'$, for if $r_i(n)$ is given for all $n \in \mathbb{N}$, then this
defines $r(n)$ for all $n \in {}^*\mathbb{N}$. Therefore, $R'$ can be identified with ${}^*\mathbb{R}$.



2.15 Getting away with generating sequences and H(s_i); summary

Recall from classical analysis that the real numbers were introduced by means 
of Cauchy sequences of rational numbers. In more formalistic mathematics a real 
number 'is' the set of all Cauchy sequences concurrent with a given one, but in 
practice one is seldom working with these sequences. In a similar way we have been 
introducing the hypernumbers, say the hyperreals, by means of infinite sequences 
$(x_i)$ of reals $x_i$. Rather than letting a hyperreal be some set, it was preferred
to let it be something new, i.e. $H(x_i)$, generated by $(x_i)$, but a more formalistic
procedure could have been followed just as well. Anyway, so many facts regarding 
nonstandard mathematics have become known in the preceding sections that, just 
as in classical real number analysis, we can in most cases do without generating 
sequences. This section serves to summarize these facts. 

Just as in classical mathematics in nonstandard mathematics there are the well- 
known notions of number, set, n-tuple, standard, internal and external notions. 
Any standard notion is internal, but no notion can be internal and external at 
the same time (see the figure in Section 1.6). There exists a bijection between the 
collection of all classical notions and the collection of all standard notions. If $s$ is
a classical notion, then ${}^*s$, called its *-transform, is the corresponding standard
one. Although the notation $H(s_i)$ will largely disappear, the asterisks will not.
Elements of internal sets are internal, but subsets of internal sets may or may
not be internal. For example, $\{1, 2, 3\}$ is an internal, but $\mathbb{N}$ is an external subset
of ${}^*\mathbb{N}$. When external sets directly or indirectly enter the definition of, say, some
function, then the latter may turn out to be external as well. 

A fairly complete list of all results found so far that are free from the $H$-operator
is given below.

1. A constant is internal if and only if it is an element of some standard set. 

2. Let $S$ be a classical set such that ${}^*s = s$ for each $s \in S$. Then ${}^*S = S$ if
and only if $S$ is finite. Otherwise $S \subset {}^*S$.

3. Let $S$ be a classical set such that each of its elements is a classical set of
which each element is equal to its *-transform. Then ${}^*S = S$ if and only
if $S$ is finite and all its elements are finite. A similar result holds if $S$ is
of any level.

4. Let $f$ be a function from $X$ onto $Y$, and assume that $X$ is finite, that
${}^*x = x$ for all $x \in X$ and that ${}^*y = y$ for all $y \in Y$. Then ${}^*X = X$,
${}^*Y = Y$ and ${}^*f = f$.

5. Let $g : W \to X$ and $f : X \to Y$, both $g$ and $f$ be surjective, and assume
that $X$ is finite, that ${}^*x = x$ for all $x \in X$ and that ${}^*y = y$ for all $y \in Y$.
Then,

${}^*(f \circ g)(H(w_i)) = (f \circ {}^*g)(H(w_i))$ for all $H(w_i) \in {}^*W$, and
${}^*(f \circ g)({}^*w) = (f \circ g)(w)$ for all $w \in W$,

even if $W$ is not finite.

6. Summary of useful results. 

a) ${}^*\emptyset = \emptyset$.

b) ${}^*S = {}^*T$ if and only if $S = T$,
${}^*S \ne {}^*T$ if and only if $S \ne T$,
${}^*s \in {}^*S$ if and only if $s \in S$,
${}^*S \subset {}^*T$ if and only if $S \subset T$,
and similarly for $\subseteq$, $\supset$ and $\supseteq$.

c) ${}^*(S \cup T) = {}^*S \cup {}^*T$,
${}^*(S \cap T) = {}^*S \cap {}^*T$,
${}^*(S - T) = {}^*S - {}^*T$,
if $S \subset T$ then ${}^*(S^c) = ({}^*S)^c$, where taking the complement is with
respect to $T$ and ${}^*T$, respectively.

d) ${}^*(s, t) = ({}^*s, {}^*t)$,
and similar equalities hold for $n$-tuples, $n = 3, 4, \dots$

e) ${}^*(f(x)) = {}^*f({}^*x)$.

f) ${}^*[(f \circ g)(w)] = ({}^*f \circ {}^*g)({}^*w)$.

7. If $S$ is a classical infinite set of numbers, then $S$ is external.

8. If $A$ is a set of numbers, the inclusion ${}^*(\mathcal{P}(A)) \subset \mathcal{P}({}^*A)$ always holds and
is strict if and only if $A$ is an infinite set, and if so, $\mathcal{P}({}^*A)$ contains an
element that is an external subset of $A$.

9. (The internal definition principle.) Let any statement be given, say $P(x, X, s)$,
where $X$ is some set and $x \in X$ makes sense. Let $X$ and $s$ be internal,
then so is the set,

$$\{x \in X : P(x, X, s)\}.$$

10. (Transfer, first formulation.) Let $R(X, X', X'', \dots ; s, s', s'', \dots)$ be a given
statement with constants or free variables $X, X', X'', \dots$ and $s, s', s'', \dots$,
and bound variables $x, x', x'', \dots$, where $X, X', X'', \dots$ are sets, and where
$x$ occurs in either $\exists x \in X$ or $\forall x \in X$, and similarly for $x', x'', \dots$. Then,

$$R(X, X', X'', \dots ; s, s', s'', \dots) \equiv R({}^*X, {}^*X', {}^*X'', \dots ; {}^*s, {}^*s', {}^*s'', \dots).$$

11. (Transfer, second formulation.) Given any internal statement, replacing
everything, including every bound variable, by its standard version is
equivalent to replacing everything except every bound variable by its
standard version.

12. (The standard definition principle.) Let ${}^*X$ be a standard set, $x \in {}^*X$
make sense, ${}^*s$ be standard, and $P(x, {}^*X, {}^*s)$ be any statement. Then
also,

$$\{x \in {}^*X : P(x, {}^*X, {}^*s)\}$$

is standard.

13. Infinitely large numbers and nonzero infinitesimals exist. 

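14. For $\varepsilon \ne 0$, $\varepsilon \simeq 0$ if and only if $1/\varepsilon \simeq +\infty$ or $1/\varepsilon \simeq -\infty$, hence $s$ is
appreciable if and only if $1/s$ is appreciable.

Let $\varepsilon \simeq 0$, $\varepsilon' \simeq 0$, $s$ and $s'$ be appreciable, and $\omega \simeq \infty$, $\omega' \simeq \infty$. Then,

$\varepsilon + \varepsilon' \simeq 0$, $\varepsilon - \varepsilon' \simeq 0$, $\varepsilon \cdot \varepsilon' \simeq 0$,

$\varepsilon + s$ and $\varepsilon - s$ are appreciable, and $\varepsilon \cdot s \simeq 0$,

$\varepsilon + \omega \simeq \infty$, $\varepsilon - \omega \simeq -\infty$, and $\varepsilon \cdot \omega \simeq +\infty$ or $-\infty$ or $0$, or
$\varepsilon \cdot \omega$ is appreciable,

$s + s'$ and $s - s'$ are appreciable or $\simeq 0$, and $s \cdot s'$ is appreciable,

$s + \omega \simeq +\infty$, $s - \omega \simeq -\infty$, $s \cdot \omega \simeq +\infty$ or $-\infty$,

$\omega + \omega' \simeq +\infty$, $\omega - \omega' \simeq +\infty$ or $-\infty$ or $\simeq 0$, or $\omega - \omega'$ is appreciable,
and $\omega \cdot \omega' \simeq +\infty$.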

15. In *R the set of all infinitesimals is external. 

16. In *R the set of all positive hyperlarge numbers is external. 

17. (Overflow or overspill.) Let $S$ be an internal subset of ${}^*T$, where $T$ is either
$\mathbb{N}$ or $\mathbb{Z}$, or $\mathbb{Q}$, or $\mathbb{R}$, such that $\forall m \in \mathbb{N} : \exists s(m) \in S : s(m) > m$, i.e. such
that from a classical point of view $S$ contains arbitrarily large elements,
then $\exists s \in S : s \simeq \infty$, i.e. $S$ contains some infinitely large element.

18. (Underflow or underspill.) Let $S$ be an internal subset of ${}^*T$, with $T$ as
before, such that $\forall \omega \in {}^*\mathbb{N}, \omega \simeq \infty : \exists s(\omega) \in S : s(\omega) < \omega \wedge s(\omega) \simeq \infty$, i.e.
such that $S$ contains infinitely large elements that are arbitrarily small,
then $\exists s \in S : s$ is limited.

19. ('Inverse' overflow.) Let $S$ be an internal subset of ${}^*\mathbb{Q}$ or ${}^*\mathbb{R}$, such that
$\forall m \in \mathbb{N} : \exists s(m) \in S : |s(m)| < 1/m$, i.e. such that from a classical point
of view $S$ contains arbitrarily small elements, then $\exists s \in S : s \simeq 0$.

20. ('Inverse' underflow.) Let $S$ be an internal subset of ${}^*\mathbb{Q}$ or ${}^*\mathbb{R}$, such that
$\forall \varepsilon, \varepsilon \simeq 0 : \exists s \in S : s > \varepsilon$, then $\exists s \in S : s$ is not an infinitesimal.

21. If $n \in {}^*\mathbb{N}$, then either $n \in \mathbb{N}$ or $n$ is hyperlarge.

22. Given any $\omega \in {}^*\mathbb{N}$, $\omega \simeq \infty$, then $S = [1, \omega]$ is uncountable.

23. Let $f$ be an internal function from ${}^*\mathbb{N}$ to $\{1, 2\}$, such that both 1 and 2
are assumed somewhere, i.e. such that $f$ is onto. Moreover, let $f(n) = 2$
for all $n > b$ for some $b \in {}^*\mathbb{N}$. Then there is a $\beta \le b$, $\beta \in {}^*\mathbb{N}$ such that
$f(\beta) = 1$ and $f(n) = 2$ if $n > \beta$. Hence there is a last $\beta$ such that $f(\beta) = 1$,
even though $b \simeq \infty$ is allowed, and $\{n : n \le b\}$ is uncountable in that
case.

24. (Standard part theorem.) If $x \in {}^*\mathbb{R}$ then either $|x| \simeq \infty$, or $x = r + \varepsilon$,
$r \in \mathbb{R}$, $\varepsilon \simeq 0$ for unique $r$ and $\varepsilon$.

25. Given any $r \in \mathbb{R}$, there exists an $x \in {}^*\mathbb{Q}$ such that $\mathrm{st}(x) = r$.

26. Let $a, b \in \mathbb{R}$, $a < b$. If $x \in {}^*[a, b]$, then $\mathrm{st}(x) \in [a, b]$, but if the interval in
$\mathbb{R}$ is not closed, this is sometimes not true.



Chapter 3 

Some applications 



3.1 Introduction and least upper bound theorem 

The aim of this chapter is to show how many definitions and proofs of elementary 
calculus can be simplified by means of nonstandard analysis. Only a number 
of important examples will be considered. A much more complete treatment is 
Keisler [26], where the existence of nonstandard numbers is taken for granted, 
however, and a simplified form of transfer is introduced in an axiomatic kind of 
way. 

Theorem 3.1.1 (The least upper bound theorem.) 

Let S be a nonempty subset of R that is bounded above by some (classical) real 
number. Then S has a least upper bound in R. 

Proof: Taking any $c \in S$, instead of $S$ we may consider $\{s : s \in S, s \ge c\}$,
that is to say we may assume that $s \ge c$ for all $s \in S$. Then $c, b \in \mathbb{R}$, $c < b$,
exist such that $\forall s \in S : c \le s \le b$, so that, by transfer, $\forall s \in {}^*S : c \le s \le b$.
Let $\omega \in {}^*\mathbb{N}$, $\omega \simeq \infty$ be arbitrary and divide ${}^*[c, b]$ in $\omega$ equal subintervals of
length $\delta = (b - c)/\omega$, so that $\delta \simeq 0$, and, writing $a = c$, consider the points
$a, a + \delta, a + 2\delta, \dots, a + \omega\delta = b$. Then,

$$\exists j \in {}^*\mathbb{N} : [\forall s \in {}^*S : s \le a + j\delta] \wedge [\exists s' \in {}^*S : s' > a + j\delta - \delta].$$

Let $\beta = \mathrm{st}(a + j\delta)$, which is well defined as $a + j\delta$ is limited. Then $\beta$ is a (hence the)
least upper bound of $S$. For first of all if $s \in S$ then $s \in {}^*S$, hence $s \le a + j\delta = \beta + \varepsilon$
for some $\varepsilon \simeq 0$, but since $s, \beta \in \mathbb{R}$ this means that $s \le \beta$. And secondly, if $\beta'$
were a smaller upper bound of $S$, then $\beta > \beta' + 1/m$ for some $m \in \mathbb{N}$, hence
$\beta' \ge s' > a + j\delta - \delta = \beta + \varepsilon - \delta > \beta' + 1/m + \varepsilon - \delta$, or $\delta - \varepsilon > 1/m$, a contradiction.

□ 

Note that this proof is not much shorter than its classical counterpart, that
essentially runs as follows. Let $a_1 = a$, $b_1 = b$ and $n = 1$.

(*) If $S \subset [a, (a_n + b_n)/2]$, then let $a_{n+1} = a_n$, $b_{n+1} = (a_n + b_n)/2$, otherwise let
$a_{n+1} = (a_n + b_n)/2$, $b_{n+1} = b_n$. In either case replace $n$ by $n + 1$ and start again
from (*).

This procedure defines two concurrent Cauchy sequences $(a_n)$ and $(b_n)$, both
converging to the least upper bound of $S$ (the reader may work out the details).
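
The classical bisection procedure is easy to run for a concrete set. The sketch below is a hedged illustration, not part of the original text: the name `bisect_sup` and the predicate argument `is_upper_bound` are hypothetical, and the set is taken to be $S = \{q \in \mathbb{Q} : q \ge 0, q^2 < 2\}$ inside $[0, 2]$, for which a nonnegative rational $m$ is an upper bound exactly when $m^2 \ge 2$.

```python
from fractions import Fraction


def bisect_sup(is_upper_bound, a, b, steps):
    """Run the step (*) `steps` times and return the final pair (a_n, b_n)."""
    for _ in range(steps):
        mid = (a + b) / 2
        if is_upper_bound(mid):
            b = mid  # S lies to the left of mid: keep the lower half
        else:
            a = mid  # some element of S exceeds mid: keep the upper half
    return a, b


# S = {q in Q : q >= 0 and q*q < 2}, enclosed in [0, 2].
a_n, b_n = bisect_sup(lambda m: m * m >= 2, Fraction(0), Fraction(2), 30)
```

Both endpoints stay rational while the enclosed least upper bound $\sqrt{2}$ is not, in line with the remark that the theorem fails in $\mathbb{Q}$.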

Since the theorem is wrong if $\mathbb{R}$ is replaced by $\mathbb{Q}$, both proofs must use something
that is typical for R. Indeed each limited element of *R (not *Q) has a standard 
part, and any Cauchy sequence converges to some element of R (not Q). This 
illustrates the obvious fact that a nonstandard proof must contain all essential 
steps - perhaps in disguise - of the corresponding classical proof. 



3.2 Simplifying definitions and proofs of elementary 
calculus 

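First of all recall from Sections 1.4 and 1.5 that a function $f$ from $\mathbb{R}$ to $\mathbb{R}$ is
continuous at $c \in \mathbb{R}$ if,

$$\forall \varepsilon \in \mathbb{R}, \varepsilon > 0 : \exists \delta \in \mathbb{R}, \delta > 0 : \forall x \in \mathbb{R}, |x - c| < \delta : |f(x) - f(c)| < \varepsilon$$

or, equivalently, if,

$$\forall \varepsilon \in {}^*\mathbb{R}, \varepsilon > 0 : \exists \delta \in {}^*\mathbb{R}, \delta > 0 : \forall x \in {}^*\mathbb{R}, |x - c| < \delta : |{}^*f(x) - {}^*f(c)| < \varepsilon$$

or, equivalently, if,

$$\forall \delta \in {}^*\mathbb{R}, \delta \simeq 0 : {}^*f(c + \delta) - {}^*f(c) \simeq 0.$$

The first simplification, therefore, reads as follows.

Theorem 3.2.1 (Simplified definition of the continuity of real-valued real functions.)

$f : \mathbb{R} \to \mathbb{R}$ is continuous at $c \in \mathbb{R}$ if and only if,

$$\forall \delta \in {}^*\mathbb{R}, \delta \simeq 0 : {}^*f(c + \delta) - {}^*f(c) \simeq 0.$$

□

Theorem 3.2.2 (Simplified definition of uniform continuity.)

$f : S \to \mathbb{R}$, $S \subset \mathbb{R}$ is uniformly continuous in $S$ if and only if,

$$\forall x, y \in {}^*S, x - y \simeq 0 : {}^*f(x) - {}^*f(y) \simeq 0.$$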



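Proof: Recall that $f : S \to \mathbb{R}$, $S \subset \mathbb{R}$ is uniformly continuous in $S$ if,

$$\forall \varepsilon \in \mathbb{R}, \varepsilon > 0 : \exists \delta \in \mathbb{R}, \delta > 0 : \forall x, y \in S, |x - y| < \delta : |f(x) - f(y)| < \varepsilon. \qquad (3.1)$$

By transfer, this is equivalent to,

$$\forall \varepsilon \in {}^*\mathbb{R}, \varepsilon > 0 : \exists \delta \in {}^*\mathbb{R}, \delta > 0 : \forall x, y \in {}^*S, |x - y| < \delta : |{}^*f(x) - {}^*f(y)| < \varepsilon.$$

But this can be simplified to,

$$\forall x, y \in {}^*S, x - y \simeq 0 : {}^*f(x) - {}^*f(y) \simeq 0. \qquad (3.2)$$

For let (3.1) be true and let $m \in \mathbb{N}$ be given arbitrarily. Then there exist $\varepsilon \in \mathbb{R}$,
$\varepsilon > 0$ such that $\varepsilon < 1/m$, and $\delta$ as in (3.1). Hence $\forall x, y \in S, |x - y| < \delta :$
$|f(x) - f(y)| < 1/m$, or, by transfer,

$$\forall x, y \in {}^*S, |x - y| < \delta : |{}^*f(x) - {}^*f(y)| < 1/m,$$

so that, as $m$ was arbitrary, ${}^*f(x) - {}^*f(y) \simeq 0$ whenever $x - y \simeq 0$, which proves (3.2), since in
absolute value any infinitesimal is smaller than the real $\delta$.

Conversely, let (3.2) be true, let $\varepsilon \in \mathbb{R}$, $\varepsilon > 0$ be arbitrary and let $\delta \simeq 0$, $\delta > 0$.
Hence if $x, y \in {}^*S$, $|x - y| < \delta$ then ${}^*f(x) - {}^*f(y) = \varepsilon'$ for some $\varepsilon' \simeq 0$. In other
words, since $|\varepsilon'| < \varepsilon$,

$$\exists \delta' \in {}^*\mathbb{R}, \delta' > 0 : \forall x, y \in {}^*S, |x - y| < \delta' : |{}^*f(x) - {}^*f(y)| < \varepsilon,$$

(take, for example $\delta' = \delta$) or, by transfer (in the opposite direction),

$$\exists \delta' \in \mathbb{R}, \delta' > 0 : \forall x, y \in S, |x - y| < \delta' : |f(x) - f(y)| < \varepsilon,$$

which proves (3.1) since $\varepsilon$ was arbitrary. □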

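Theorem 3.2.3 If $f$ is continuous at each $x \in [a, b]$, $a, b \in \mathbb{R}$, $a < b$, then $f$ is
uniformly continuous in $[a, b]$.

Simplified proof: Let $x, y$ be in ${}^*[a, b]$, then by Theorem 2.13.3, $\mathrm{st}(x) \in [a, b]$ and
$\mathrm{st}(y) \in [a, b]$. Let $x - y \simeq 0$. Since $y - \mathrm{st}(y) \simeq 0$ it follows that $x - \mathrm{st}(y) \simeq 0$. By
continuity ${}^*f(y) - {}^*f(\mathrm{st}(y)) \simeq 0$ and ${}^*f(x) - {}^*f(\mathrm{st}(y)) \simeq 0$, so that ${}^*f(x) - {}^*f(y) \simeq 0$. □

Theorem 3.2.4 (Simplified limit definition.)

Let $f : \mathbb{R} \to \mathbb{R}$, then $\lim_{x \to c} f(x) = k$, $c, k \in \mathbb{R}$ if and only if,

$$\forall \delta \in {}^*\mathbb{R}, \delta \simeq 0, \delta \ne 0 : {}^*f(c + \delta) - k \simeq 0,$$

so that $k = \mathrm{st}[{}^*f(c + \delta)]$.

Proof: By definition, the limit exists if and only if,

$$\forall \varepsilon \in \mathbb{R}, \varepsilon > 0 : \exists \delta \in \mathbb{R}, \delta > 0 : \forall x \in \mathbb{R}, 0 < |x - c| < \delta : |f(x) - k| < \varepsilon,$$

or, by transfer,

$$\forall \varepsilon \in {}^*\mathbb{R}, \varepsilon > 0 : \exists \delta \in {}^*\mathbb{R}, \delta > 0 : \forall x \in {}^*\mathbb{R}, 0 < |x - c| < \delta : |{}^*f(x) - k| < \varepsilon,$$

which can be simplified to,

$$\forall \delta \in {}^*\mathbb{R}, \delta \simeq 0, \delta \ne 0 : {}^*f(c + \delta) - k \simeq 0.$$

□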

Exercise: Complete this proof. 

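Now let $f : \mathbb{N} \to \mathbb{R}$, so that $f$ is an infinite sequence, and let $c$ be replaced by
$\infty$.

Theorem 3.2.5 (Another simplified limit definition.)

Let $f : \mathbb{N} \to \mathbb{R}$, then $\lim_{n \to \infty} f(n) = k$, $k \in \mathbb{R}$, if and only if,

$$\forall n \in {}^*\mathbb{N}, n \simeq \infty : {}^*f(n) - k \simeq 0,$$

so that $k = \mathrm{st}[{}^*f(n)]$.

Proof: By definition, the limit exists if and only if,

$$\forall \varepsilon \in \mathbb{R}, \varepsilon > 0 : \exists n' \in \mathbb{N} : \forall n \in \mathbb{N}, n > n' : |f(n) - k| < \varepsilon,$$

or, by transfer,

$$\forall \varepsilon \in {}^*\mathbb{R}, \varepsilon > 0 : \exists n' \in {}^*\mathbb{N} : \forall n \in {}^*\mathbb{N}, n > n' : |{}^*f(n) - k| < \varepsilon,$$

which can be simplified to,

$$\forall n \in {}^*\mathbb{N}, n \simeq \infty : {}^*f(n) - k \simeq 0.$$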

□ 

Exercise: Again complete the proof. 

Exercise: Treat the cases where the limit itself is infinite. 

Theorem 3.2.6 If $f$ is a nondecreasing infinite sequence, that is bounded above,
then $f(n)$ has a finite limit for $n$ tending to $\infty$.

Classical proof: The set $\{f(n) : n \in \mathbb{N}\}$ is bounded above, hence in $\mathbb{R}$ has a
least upper bound $\beta$, so that $f(n) \le \beta$ for all $n \in \mathbb{N}$, and for each $m \in \mathbb{N}$,
$f(n') > \beta - 1/m$ for some $n' \in \mathbb{N}$, and since $f$ is nondecreasing this implies that
$\lim f(n) = \beta$. □

Exercise: Give a direct nonstandard proof similar to that of Theorem 3.1.1 (the
least upper bound theorem), not using Theorem 3.1.1.

Theorem 3.2.7 (The intermediate value theorem.)

If $a, b \in \mathbb{R}$, $a < b$, and $f(a) < 0$, $f(b) > 0$, then $f(c) = 0$ for some $c$, $a < c < b$.



Simplified proof: See Section 1.4. □ 
Theorem 3.2.8 (The extreme value theorem.) 

Let f : [a, b] — > R, a, b G R, a < b, and let f be continuous at each point of [a, b}. 
Then f\x) < f(c) for some c G [a, b] and all x G [a, b], i.e. f has a maximum 
somewhere in the closed interval between a and b. And similarly for minimum. 



Simplified proof: Let $\omega \in {}^*\mathbb{N}$, $\omega \sim \infty$, be arbitrary, and divide ${}^*[a, b]$ in $\omega$ equal
subintervals of length $\delta = (b - a)/\omega$. Let $n \in {}^*\mathbb{N}$ be such that ${}^*f(a + n\delta) \ge
{}^*f(a + i\delta)$ for all $i = 0, 1, \ldots, \omega$. The existence of $n$ follows by transfer, since any
finite set has a maximum, hence so has any hyperfinite set. Obviously, $a + n\delta$ is
limited, hence $c = \mathrm{st}(a + n\delta)$ is well defined and by continuity, ${}^*f(a + n\delta) - f(c) = \varepsilon$
for some $\varepsilon \sim 0$. Each $x \in [a, b]$ is within the distance $\delta$ of some $a + i\delta$ and $\delta \sim 0$,
hence, again by continuity, $f(x) = {}^*f(a + i\delta) + \varepsilon'$ for some $\varepsilon' \sim 0$, hence,

$$f(x) \le {}^*f(a + n\delta) + \varepsilon' = f(c) + \varepsilon + \varepsilon', \text{ i.e. } f(x) \le f(c).$$

□ 






Theorem 3.2.9 (The composite function theorem.) 

Let $g(w)$ be defined for $w$ in a neighborhood of $c \in \mathbb{R}$, and let $f(x)$ be defined
for $x$ in a neighborhood of $g(c)$. Then $f \circ g$ is continuous at $c$ if $g$ is continuous at
$c$ and $f$ is continuous at $g(c)$.

Simplified proof: Let $\delta \sim 0$, then ${}^*g(c + \delta) - g(c) \sim 0$, hence it follows that
${}^*f({}^*g(c + \delta)) - f(g(c)) \sim 0$. □



3.3 Continuity and limits for internal functions 

So far nonstandard characterizations were given for continuity and limits of classical
functions. How about arbitrary internal functions? Let $f$ be some internal
function from ${}^*\mathbb{R}$ to ${}^*\mathbb{R}$, and let $c \in {}^*\mathbb{R}$, so that $c$ may be hyperlarge or 'almost
standard', i.e. be the sum of a real number and a nonzero infinitesimal. Or rather,
let $F$ be the set of all classical functions from $\mathbb{R}$ to $\mathbb{R}$ and let $f \in {}^*F$. Then, by
definition D) of Section 2.9, for suitable $f_i$ and $c_i$, $f = H(f_i)$ is *continuous at
$c = H(c_i)$ if for all $i \in \mathbb{N}$, $f_i$ is continuous at $c_i$. Here $f_i : \mathbb{R} \to \mathbb{R}$ and $c_i \in \mathbb{R}$.

Theorem 3.3.1 (Continuity of internal functions.) 

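The internal $f : {}^*\mathbb{R} \to {}^*\mathbb{R}$ is *continuous at $c \in {}^*\mathbb{R}$ if and only if in the classical
definition $\mathbb{R}$ is replaced by ${}^*\mathbb{R}$, i.e. if,

$$\forall \varepsilon \in {}^*\mathbb{R}, \varepsilon > 0 : \exists \delta \in {}^*\mathbb{R}, \delta > 0 : \forall x \in {}^*\mathbb{R}, |x - c| < \delta : |f(x) - f(c)| < \varepsilon.$$

Proof: Letting $f = H(f_i)$ and $c = H(c_i)$, $f$ is *continuous at $c$ if and only if,

$$H[\forall \varepsilon_i \in \mathbb{R}, \varepsilon_i > 0 : \exists \delta_i \in \mathbb{R}, \delta_i > 0 : \forall x_i \in \mathbb{R}, |x_i - c_i| < \delta_i : |f_i(x_i) - f_i(c_i)| < \varepsilon_i].$$

By Łoś' theorem (Theorem 2.7.1) this is equivalent to,

$$\forall H(\varepsilon_i) \in {}^*\mathbb{R}, H(\varepsilon_i) > 0 : \exists H(\delta_i) \in {}^*\mathbb{R}, H(\delta_i) > 0 :$$
$$\forall H(x_i) \in {}^*\mathbb{R}, |H(x_i) - H(c_i)| < H(\delta_i) : |H(f_i)(H(x_i)) - H(f_i)(H(c_i))| < H(\varepsilon_i),$$

and hence to what has to be proved. □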

Warning: If $f$ or $c$ is nonstandard, *continuity is not always equivalent to,

$$\forall \delta \in {}^*\mathbb{R}, \delta \sim 0 : f(c + \delta) - f(c) \sim 0.$$

Counterexamples:

a) $c$ standard, but $f$ nonstandard; let $\omega \sim \infty$, $f(x) = \omega x$, $c = 1$, $\delta = 1/\omega^{1/2}$;
then $f(c + \delta) - f(c) = \omega\delta \sim \infty$.

b) $f$ standard, but $c$ nonstandard; let $f(x) = x^2$, $\delta \sim 0$, $c = 1/\delta$; then
$f(c + \delta) - f(c) = 2 + \delta^2 \sim 2$.
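
The arithmetic behind the two counterexamples can be written out in full. The sketch below only expands the expressions already given; in b) it is assumed that $\delta \neq 0$, so that $c = 1/\delta$ is defined.

```latex
% Counterexample a): f(x) = \omega x, c = 1, \delta = \omega^{-1/2} \sim 0.
\[
  f(c+\delta) - f(c) \;=\; \omega(1+\delta) - \omega \;=\; \omega\delta
  \;=\; \omega^{1/2} \;\sim\; \infty .
\]
% Counterexample b): f(x) = x^2, c = 1/\delta with \delta \sim 0, \delta \neq 0 (assumed).
\[
  f(c+\delta) - f(c)
  \;=\; \Bigl(\frac{1}{\delta} + \delta\Bigr)^{2} - \frac{1}{\delta^{2}}
  \;=\; \frac{1}{\delta^{2}} + 2 + \delta^{2} - \frac{1}{\delta^{2}}
  \;=\; 2 + \delta^{2} \;\sim\; 2 .
\]
% In both cases the increment of f is not infinitesimal although \delta is.
```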

Yet, $\forall \delta \in {}^*\mathbb{R}, \delta \sim 0 : f(c + \delta) - f(c) \sim 0$ makes sense for arbitrary internal $f$
and $c$. If this is true, then $f$ is called S-continuous at $c$.

Examples: 

a) Let $\alpha$ be a positive infinitesimal, and $f(x) = \alpha$ if $x \ge 0$, $f(x) = 0$ if
$x < 0$. Then $f$ is S-continuous everywhere in ${}^*\mathbb{R}$. It is *continuous at
$c \in {}^*\mathbb{R}$ if $c \neq 0$, but not at $c = 0$.

b) Let $\omega \sim \infty$ and $f(x) = \omega x$. Then $f$ is nowhere S-continuous, since $f(x) -
f(c) = \omega(x - c) = \omega^{1/2}$ if $x - c = \omega^{-1/2} \sim 0$. It is *continuous everywhere
in ${}^*\mathbb{R}$.

c) $f(x) = x^2$. Then $f$ is not S-continuous if $c \sim \infty$. It is *continuous everywhere
in ${}^*\mathbb{R}$.

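Theorem 3.3.2 The internal function $f : {}^*\mathbb{R} \to {}^*\mathbb{R}$ is S-continuous at $c \in {}^*\mathbb{R}$
if and only if,

$$\forall \varepsilon \in \mathbb{R}, \varepsilon > 0 : \exists \delta \in \mathbb{R}, \delta > 0 : \forall x \in {}^*\mathbb{R}, |x - c| < \delta : |f(x) - f(c)| < \varepsilon.$$

(Note that both $\varepsilon$ and $\delta$ are standard, but that $x$ is internal.)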

Proof: The if part. Let $\varepsilon \in \mathbb{R}$, $\varepsilon > 0$ and $\delta \in {}^*\mathbb{R}$, $\delta \sim 0$ be given arbitrarily. Then
there is a $\delta' \in \mathbb{R}$, $\delta' > 0$ such that,

$$\forall x \in {}^*\mathbb{R}, |x - c| < \delta' : |f(x) - f(c)| < \varepsilon.$$

As $|\delta| < \delta'$, so that $|x - c| < \delta'$ if $x = c + \delta$, it follows that $|f(c + \delta) - f(c)| < \varepsilon$,
and since $\varepsilon$ is arbitrary that $f(c + \delta) - f(c) \sim 0$.

The only-if part. Conversely, let $\varepsilon$ and $\delta$ be as before but such that $\delta > 0$. Then
$\forall x \in {}^*\mathbb{R}, |x - c| < \delta : |f(x) - f(c)| < \varepsilon$. Now let the set $S$ be defined by,

$$S = \{\delta \in {}^*\mathbb{R} : \delta > 0 \text{ and } \forall x \in {}^*\mathbb{R}, |x - c| < \delta : |f(x) - f(c)| < \varepsilon\}.$$

Then if $\delta \in S$ every number between $0$ and $\delta$ is in $S$. By the internal definition
principle (Corollary 2.7.1) $S$ is internal, but it contains as a subset the set of all
positive infinitesimals and the latter is external (see Theorems 2.10.3 and 2.11.4),
so that $S$ must contain some $\delta > 0$ that is not an infinitesimal, and it follows
that $S$ must contain some $\delta > 0$, $\delta \in \mathbb{R}$. □

Remark: In this proof the fact that an external set is not internal has been 
used. This fact is called Cauchy's principle. It is an example of a principle of 
permanence. In general this is a statement that if some set S contains some subset 
T, the latter is strictly contained in S, because S and T happen to be different 
kinds of set. In classical mathematics such principles do not seem to play a real 
part, but in nonstandard mathematics there are several of them, although there 
are only a few primary forms, or perhaps only one, i.e. Cauchy's principle, which 
obviously really is a matter of definition. See Section 4.1 for more details. 

In a similar way the *-transform of uniform continuity can be introduced: simply 
copy the classical definition and replace R by *R. And the simplified form of 
uniform continuity leads to S-uniform continuity. Hence the internal $f : {}^*\mathbb{R} \to {}^*\mathbb{R}$
is S-uniformly continuous in ${}^*\mathbb{R}$ if,

$$\forall x, y \in {}^*\mathbb{R}, x - y \sim 0 : f(x) - f(y) \sim 0.$$

Exercise: Formulate and show a theorem similar to Theorem 3.3.2, but for uniform 
continuity. 

It is generally agreed to drop the asterisks in both *continuity and *uniform
continuity, as well as in similar indications. But keep in mind that S-continuity
is then not a special form of continuity, and similarly for other attributes.

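Turning to *limits, the definitions similar to those in Section 3.2 are, dropping
the asterisks: the internal $f : {}^*\mathbb{R} \to {}^*\mathbb{R}$ tends to the limit $k \in {}^*\mathbb{R}$ for $x \in {}^*\mathbb{R}$
tending to $c \in {}^*\mathbb{R}$, if,

$$\forall \varepsilon \in {}^*\mathbb{R}, \varepsilon > 0 : \exists \delta \in {}^*\mathbb{R}, \delta > 0 : \forall x \in {}^*\mathbb{R}, 0 < |x - c| < \delta : |f(x) - k| < \varepsilon.$$

And the internal $f : {}^*\mathbb{N} \to {}^*\mathbb{R}$ tends to the limit $k \in {}^*\mathbb{R}$ for $n \in {}^*\mathbb{N}$ tending to
infinity, if,

$$\forall \varepsilon \in {}^*\mathbb{R}, \varepsilon > 0 : \exists n' \in {}^*\mathbb{N} : \forall n \in {}^*\mathbb{N}, n > n' : |f(n) - k| < \varepsilon.$$

Similar to S-continuity the definitions of S-limit are as follows. The internal
$f : {}^*\mathbb{R} \to {}^*\mathbb{R}$ tends to the S-limit $k \in {}^*\mathbb{R}$ for $x \in {}^*\mathbb{R}$ tending to $c \in {}^*\mathbb{R}$, if,

$$\forall \delta \sim 0 : f(c + \delta) - k \sim 0.$$

And the internal $f : {}^*\mathbb{N} \to {}^*\mathbb{R}$ tends to the S-limit $k \in {}^*\mathbb{R}$ for $n \in {}^*\mathbb{N}$ tending
to infinity, if,

$$\forall n \in {}^*\mathbb{N}, n \sim \infty : f(n) - k \sim 0.$$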
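
Theorem 3.3.3 The internal $f : {}^*\mathbb{R} \to {}^*\mathbb{R}$ tends to the S-limit $k \in {}^*\mathbb{R}$ for
$x \in {}^*\mathbb{R}$ tending to $c \in {}^*\mathbb{R}$, if and only if,

$$\forall \varepsilon \in \mathbb{R}, \varepsilon > 0 : \exists \delta \in \mathbb{R}, \delta > 0 : \forall x \in {}^*\mathbb{R}, 0 < |x - c| < \delta : |f(x) - k| < \varepsilon.$$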



Proof: Left as an exercise. □ 

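Theorem 3.3.4 The internal $f : {}^*\mathbb{N} \to {}^*\mathbb{R}$ tends to the S-limit $k \in {}^*\mathbb{R}$ for
$n \in {}^*\mathbb{N}$ tending to infinity, if and only if,

$$\forall \varepsilon \in \mathbb{R}, \varepsilon > 0 : \exists n' \in \mathbb{N} : \forall n \in {}^*\mathbb{N}, n > n' : |f(n) - k| < \varepsilon.$$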



Proof: Left as an exercise. □ 

A special case arises when $k$ is finite, because then $\mathrm{st}(k)$ is well defined. Then
also $f(x)$ or $f(n)$ are finite for $x$ close enough to $c$ or $n$ large enough. Assuming
in the first case that $c$ is finite as well, it follows that the classical limits for $x$
tending to $\mathrm{st}(c)$ or $n$ tending to $\infty$ are equal to $\mathrm{st}(k)$.

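Theorem 3.3.5 Let $f$, $c$ and $k$ be as before, and let $k$ and $c$ be finite. If $f(x)$
tends to $k$ for $x \in {}^*\mathbb{R}$ tending to $c$, then,

$$\lim_{x \to \mathrm{st}(c)} \mathrm{st}(f(x)) = \mathrm{st}(k), \text{ where } x \in \mathbb{R},$$

and if $f(n)$ tends to $k$ for $n \in {}^*\mathbb{N}$ tending to $\infty$, then,

$$\lim_{n \to \infty} \mathrm{st}(f(n)) = \mathrm{st}(k), \text{ where } n \in \mathbb{N}.$$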



Proof: In the first case

$$\exists \delta_1 \in \mathbb{R}, \delta_1 > 0 : \forall x \in {}^*\mathbb{R}, 0 < |x - c| < \delta_1 : |f(x) - k| < 1,$$

so that $\forall x \in \mathbb{R}, 0 < |x - \mathrm{st}(c)| < \delta_1/2 : |f(x)| < |\mathrm{st}(k)| + 2$, which means that
$f(x)$ is finite for these $x$'s, from which the first claim follows. The second claim
is shown in a similar way. □

Exercise: Treat the cases where the limit itself is infinite. And define the
corresponding S-limits.



3.4 More nonstandard characterizations of classical
notions

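Theorem 3.4.1 (Nonstandard characterization of Cauchy sequence.)
$s(n)$ is a classical Cauchy sequence if and only if,

$$\forall n, p \in {}^*\mathbb{N}, n, p \sim \infty : {}^*s(n) - {}^*s(p) \sim 0. \qquad (3.4)$$

Proof: By definition,

$$\forall m \in \mathbb{N} : \exists k \in \mathbb{N} : \forall n, p \in \mathbb{N}, n, p > k : |s(n) - s(p)| < 1/m, \qquad (3.5)$$

and by transfer, fixing $m \in \mathbb{N}$ and $k \in \mathbb{N}$,

$$\forall n, p \in \mathbb{N}, n, p > k : |s(n) - s(p)| < 1/m,$$

is equivalent to,

$$\forall n, p \in {}^*\mathbb{N}, n, p > k : |{}^*s(n) - {}^*s(p)| < 1/m.$$

Now let $n, p \sim \infty$, so that automatically $n, p > k$, no matter the value of $k \in \mathbb{N}$,
then (3.5) implies that,

$$\forall m \in \mathbb{N} : \exists k \in \mathbb{N} : \forall n, p \in {}^*\mathbb{N}, n, p \sim \infty : |{}^*s(n) - {}^*s(p)| < 1/m.$$

But since $k$ plays no part any more this can be simplified to,

$$\forall m \in \mathbb{N} : \forall n, p \in {}^*\mathbb{N}, n, p \sim \infty : |{}^*s(n) - {}^*s(p)| < 1/m,$$

hence to,

$$\forall n, p \in {}^*\mathbb{N}, n, p \sim \infty : \forall m \in \mathbb{N} : |{}^*s(n) - {}^*s(p)| < 1/m,$$

hence to,

$$\forall n, p \in {}^*\mathbb{N}, n, p \sim \infty : {}^*s(n) - {}^*s(p) \sim 0,$$

which is (3.4).

Conversely, consider the negation of (3.5), that is,

$$\exists m \in \mathbb{N} : \forall k \in \mathbb{N} : \exists n, p \in \mathbb{N}, n, p > k : |s(n) - s(p)| \ge 1/m,$$

fix $m \in \mathbb{N}$ and apply transfer to,

$$\forall k \in \mathbb{N} : \exists n, p \in \mathbb{N}, n, p > k : |s(n) - s(p)| \ge 1/m,$$

giving,

$$\forall k \in {}^*\mathbb{N} : \exists n, p \in {}^*\mathbb{N}, n, p > k : |{}^*s(n) - {}^*s(p)| \ge 1/m,$$

which implies, fixing $k \sim \infty$ arbitrarily, that $n, p \sim \infty$, hence,

$$\exists n, p \in {}^*\mathbb{N}, n, p \sim \infty : |{}^*s(n) - {}^*s(p)| \ge 1/m,$$

so that the negation of (3.5) implies that,

$$\exists m \in \mathbb{N} : \exists n, p \in {}^*\mathbb{N}, n, p \sim \infty : |{}^*s(n) - {}^*s(p)| \ge 1/m,$$

or,

$$\exists n, p \in {}^*\mathbb{N}, n, p \sim \infty : \exists m \in \mathbb{N} : |{}^*s(n) - {}^*s(p)| \ge 1/m,$$

or,

$$\exists n, p \in {}^*\mathbb{N}, n, p \sim \infty : \neg[{}^*s(n) - {}^*s(p) \sim 0],$$

which is the negation of (3.4). □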
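
To see the criterion of Theorem 3.4.1 at work, it may help to test it on one Cauchy sequence and one sequence that is not Cauchy. Both sequences below are illustrative choices and are not taken from the text.

```latex
% Illustrative example 1 (assumed): s(n) = 1/n. For hyperlarge n and p both
% 1/n and 1/p are infinitesimals, and a difference of infinitesimals is one:
\[
  {}^*s(n) - {}^*s(p) \;=\; \frac{1}{n} - \frac{1}{p} \;\sim\; 0
  \qquad\text{for all } n, p \in {}^*\mathbb{N},\ n, p \sim \infty ,
\]
% so s is a Cauchy sequence.
%
% Illustrative example 2 (assumed): s(n) = (-1)^n. Take any hyperlarge n
% and p = n + 1, which is hyperlarge as well:
\[
  \bigl|{}^*s(n) - {}^*s(n+1)\bigr| \;=\; \bigl|(-1)^{n} - (-1)^{n+1}\bigr| \;=\; 2 ,
\]
% which is not an infinitesimal, so s is not a Cauchy sequence.
```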

Theorem 3.4.2 (Nonstandard characterization of bounded set.) 

Let $L$ be the set of all limited elements of ${}^*\mathbb{R}$. Then $S \subset \mathbb{R}$ is bounded if and
only if ${}^*S \subset L$.

Proof: $S$ is bounded if,

$$\exists m \in \mathbb{N} : \forall s \in S : |s| < m,$$

hence, by transfer, if,

$$\exists m \in \mathbb{N} : \forall s \in {}^*S : |s| < m,$$

so that ${}^*S \subset L$.

Conversely, if $S$ is not bounded then,

$$\forall m \in \mathbb{N} : \exists s \in S : |s| \ge m,$$

hence, by transfer,

$$\forall m \in {}^*\mathbb{N} : \exists s \in {}^*S : |s| \ge m,$$

and taking $m$ hyperlarge it follows that $|s|$ is hyperlarge for some $s \in {}^*S$, so
that $s \notin L$. □

Theorem 3.4.3 (Nonstandard characterization of open set.)

Let $S \subset \mathbb{R}$ and let $h(S) = \{t \in {}^*\mathbb{R} : t \sim s \text{ for some } s \in S\}$. Then $S$ is open if
and only if,

$$h(S) \subset {}^*S.$$

Proof: $S$ is open if,

$$\forall s \in S : \exists m \in \mathbb{N} : \forall t \in \mathbb{R}, |t - s| < 1/m : t \in S,$$

hence, by transfer,

$$\forall s \in S : \exists m \in \mathbb{N} : \forall t \in {}^*\mathbb{R}, |t - s| < 1/m : t \in {}^*S,$$

so that, restricting $t$ such that $t \sim s$,

$$\forall s \in S : \forall t \in {}^*\mathbb{R}, t \sim s : t \in {}^*S, \text{ i.e. } h(S) \subset {}^*S.$$

Conversely, if $S$ is not open, then,

$$\exists s \in S : \forall m \in \mathbb{N} : \exists t \in \mathbb{R}, |t - s| < 1/m : t \notin S,$$

hence, by transfer,

$$\exists s \in S : \forall m \in {}^*\mathbb{N} : \exists t \in {}^*\mathbb{R}, |t - s| < 1/m : t \notin {}^*S,$$

and taking $m$ hyperlarge it follows that for some $s \in S$ and some $t \in {}^*\mathbb{R}$ we have
that $t \sim s$, but $t \notin {}^*S$, hence that $h(S)$ is not a subset of ${}^*S$. □

Remark: $h(S)$ is called the halo (or the monad) of $S$.
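
As a hedged illustration of Theorem 3.4.3, the halo criterion can be checked on an open and on a closed interval. The two sets and the symbol $\eta$ for a positive infinitesimal are illustrative choices, not taken from the text.

```latex
% Illustrative example 1 (assumed): S = (0,1), so that
% *S = { t in *R : 0 < t < 1 } by transfer.
% Let s in S and t ~ s. Since s and 1 - s are positive reals, they are not
% infinitesimals, so adding the infinitesimal t - s cannot change their sign:
\[
  t \;=\; s + (t - s) \;>\; 0 , \qquad 1 - t \;=\; (1 - s) - (t - s) \;>\; 0 ,
\]
% hence t is in *S, i.e. h(S) is a subset of *S, and S is open.
%
% Illustrative example 2 (assumed): S = [0,1], with \eta a positive infinitesimal.
\[
  t \;=\; -\eta \;\sim\; 0 \in S , \qquad\text{but}\qquad t < 0 , \ \text{so } t \notin {}^*S ,
\]
% hence h(S) is not a subset of *S, and S is not open.
```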

Theorem 3.4.4 (Nonstandard characterization of closed set.) 
$S \subset \mathbb{R}$ is closed if and only if,

$$h(S^c) \subset {}^*(S^c) = ({}^*S)^c.$$

Proof: Follows directly from the previous theorem. □

Exercise: Show the last theorem independently of the previous theorem.

Theorem 3.4.5 (Nonstandard characterization of interior point.) 

Let $s \in \mathbb{R}$ and let $h(s) = \{t \in {}^*\mathbb{R} : t \sim s\}$. Then $s$ is an interior point of $S \subset \mathbb{R}$
if and only if $h(s) \subset {}^*S$.

Proof: Since $s$ is an interior point of $S$ if,

$$\exists m \in \mathbb{N} : \forall t \in \mathbb{R}, |t - s| < 1/m : t \in S,$$

the proof is a simplified version of that of Theorem 3.4.3. The details are left as
an exercise. □

Remark: In view of the previous remark, $h(s)$ is of course called the halo of the
point $s$. Note that $h(0)$ is the set of all infinitesimals.

Theorem 3.4.6 (Nonstandard characterization of boundary point.) 

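$s \in \mathbb{R}$ is a boundary point of $S \subset \mathbb{R}$ if and only if both $h(s) \cap {}^*S$ and $h(s) \cap ({}^*S)^c$
are nonempty.

Proof: If $s$ is a boundary point of $S$ then,

$$\forall m \in \mathbb{N} : [\exists t \in \mathbb{R}, |t - s| < 1/m : t \in S] \wedge [\exists t \in \mathbb{R}, |t - s| < 1/m : t \notin S],$$

hence, by transfer,

$$\forall m \in {}^*\mathbb{N} : [\exists t \in {}^*\mathbb{R}, |t - s| < 1/m : t \in {}^*S] \wedge [\exists t \in {}^*\mathbb{R}, |t - s| < 1/m : t \notin {}^*S],$$

so that, taking $m \sim \infty$,

$$[\exists t : t \sim s, t \in {}^*S] \wedge [\exists t : t \sim s, t \notin {}^*S],$$

hence $h(s) \cap {}^*S \neq \emptyset$ and $h(s) \cap ({}^*S)^c \neq \emptyset$.

Conversely, if $s$ is not a boundary point of $S$, then,

$$\exists m \in \mathbb{N} : [\forall t \in \mathbb{R}, |t - s| < 1/m : t \notin S] \vee [\forall t \in \mathbb{R}, |t - s| < 1/m : t \in S],$$

hence, by transfer,

$$\exists m \in \mathbb{N} : [\forall t \in {}^*\mathbb{R}, |t - s| < 1/m : t \notin {}^*S] \vee [\forall t \in {}^*\mathbb{R}, |t - s| < 1/m : t \in {}^*S].$$

The first substatement between square brackets implies that if $t \sim s$ then $t \notin {}^*S$,
so that $h(s) \subset ({}^*S)^c$, and the second one similarly that $h(s) \subset {}^*S$, so that either
$h(s) \cap {}^*S = \emptyset$ or $h(s) \cap ({}^*S)^c = \emptyset$. □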

Theorem 3.4.7 (Nonstandard characterizations of accumulation point and clo- 
sure.) 

$s \in \mathbb{R}$ is an accumulation point (or limit point) of $S \subset \mathbb{R}$ if and only if,

$$\exists t \in {}^*S, t \neq s : t \sim s.$$

Let cl $S$ be the closure of $S$. Then $s \in$ cl $S$ if and only if,

$$\exists t \in {}^*S : t \sim s.$$

Proof: If $s$ is an accumulation point of $S$ then,

$$\forall m \in \mathbb{N} : \exists t \in S, t \neq s : |t - s| < 1/m,$$

hence, by transfer,

$$\forall m \in {}^*\mathbb{N} : \exists t \in {}^*S, t \neq s : |t - s| < 1/m,$$

so that, taking $m$ hyperlarge, $\exists t \in {}^*S, t \neq s : t \sim s$.

Conversely, if $s$ is not an accumulation point of $S$, then,

$$\exists m \in \mathbb{N} : \forall t \in S, t \neq s : |t - s| \ge 1/m,$$

hence, by transfer,

$$\exists m \in \mathbb{N} : \forall t \in {}^*S, t \neq s : |t - s| \ge 1/m,$$

so that,

$$\forall t \in {}^*S, t \neq s : \neg[t \sim s].$$

The second part of the theorem follows by observing that $s \in$ cl $S$ if and only if
$s \in S$ or else if $s$ is an accumulation point of $S$. □

Exercise: Give an alternative proof of Theorem 3.4.4, using the fact that S is 
closed if and only if S = cl S. 



3.5 Inverse functions; $b^c$

Recall that a function $f : S \to T$ has an inverse $f^{-1}$ if and only if $f$ is bijective,
and then $f^{-1}(t) = s$ if $f(s) = t$.

Theorem 3.5.1 Let a function $f$ be monotonically increasing (or decreasing)
and be continuous in $[a, b]$, $a, b \in \mathbb{R}$, $a < b$. Then,

1) range$(f)$, the range of $f$, is a finite closed interval,

2) $f$ has an inverse,

3) $f^{-1}$ too is monotonically increasing (or decreasing), and

4) $f^{-1}$ is continuous in its domain.

Proof: Only the case where $f$ is increasing is considered.

1) As $a \le x \le b$ implies that $f(a) \le f(x) \le f(b)$, range$(f) \subset [f(a), f(b)]$.
And if $f(a) < w < f(b)$ then by the intermediate value theorem (Theorem
3.2.7) there is a $c \in [a, b]$ such that $f(c) = w$ which means that
$[f(a), f(b)] \subset$ range$(f)$. Therefore, range$(f) = [f(a), f(b)]$.

2, 3) Clearly, $f^{-1}$ exists, $[f(a), f(b)]$ is its domain, and it is increasing.

4) Let $w = f(c) \in [f(a), f(b)]$, and let $\varepsilon \sim 0$. If $\varepsilon > 0$ then, of course,
$w$ must be smaller than $f(b)$, and if $\varepsilon < 0$ then $w$ must be larger than
$f(a)$. Assume $\varepsilon > 0$. Note that ${}^*(f^{-1}) = ({}^*f)^{-1}$ so that parentheses are
not required here. Let $c' = {}^*f^{-1}(w + \varepsilon)$ so that $c' > c$. If $c'$ were not
infinitesimally close to $c$, then $c' > c + 1/m$ for some $m \in \mathbb{N}$. As $c, m \in \mathbb{R}$,
$f(c + 1/m) > f(c) + 1/n$ for some $n \in \mathbb{N}$, so that ${}^*f(c') > f(c + 1/m) >
f(c) + 1/n = w + 1/n$, but ${}^*f(c') = w + \varepsilon$, hence $\varepsilon > 1/n$, a contradiction.
It follows that $c' \sim c$, which shows the continuity of $f^{-1}$ at $w$. □

The next subject of this section is the introduction of $b^c$, for $b, c \in \mathbb{R}$, $b > 0$. If
$c \in \mathbb{Q}$ this can best be done in the classical way, using the functions $x^c$ and $b^x$
as well as the properties of inverse functions. Hence begin with $x^n$, $n \in \mathbb{N}$, $x \in \mathbb{R}$,
$x > 0$, which is monotonically increasing and continuous, leading to the definition
of $x^{1/n}$ as its inverse, and hence to $x^{m/n}$, $m, n \in \mathbb{N}$, $x \in \mathbb{R}$, $x > 0$, either as $(x^{1/n})^m$
or as $(x^m)^{1/n}$. To see that the two are identical, note that $(y^n)^{1/n} = y$, so that,
taking $y = (x^{1/n})^m$ it follows that,

$$(((x^{1/n})^m)^n)^{1/n} = (x^{1/n})^m,$$

and taking $y = x$ it follows that,

$$(((x^{1/n})^m)^n)^{1/n} = ((x)^m)^{1/n} = (x^m)^{1/n}.$$

For $c \in \mathbb{Q}$, $c < 0$, let $x^c = 1/x^{-c}$, and in view of $x^c \cdot x^d = x^{c+d}$, let $x^0 = 1$. Then
$x^c$, $c \in \mathbb{Q}$, $x \in \mathbb{R}$, $x > 0$ is increasing if $c > 0$, decreasing if $c < 0$, and equal to 1
if $c = 0$.

Next consider $b^x$, $b \in \mathbb{R}$, $b > 0$, $x \in \mathbb{Q}$, which is now well defined, then $b^x$ is
increasing if $b > 1$, decreasing if $b < 1$, and equal to 1 if $b = 1$; and continuous at
each $x \in \mathbb{Q}$.

For $c \in \mathbb{R}$ there is a nonstandard alternative. Given any $c \in \mathbb{R}$, by Theorem
2.13.2, $c = \mathrm{st}(c')$ for some $c' \in {}^*\mathbb{Q}$, where $c'$ is determined uniquely up
to a hyperrational infinitesimal. Now let,

$$g(c) = \mathrm{st}(b^{c'}),$$

then $g(c) = b^c$, where $b^c = \lim_{x \to c} b^x$, $x \in \mathbb{Q}$.

To see this note first of all that $g(c)$ is defined uniquely, for if $\varepsilon \sim 0$, $\varepsilon \in {}^*\mathbb{Q}$, then
$\mathrm{st}(b^{c'+\varepsilon}) = \mathrm{st}(b^{c'}) \cdot \mathrm{st}(b^{\varepsilon}) = \mathrm{st}(b^{c'}) \cdot 1$.

Now let $d \in \mathbb{Q}$, such that $c = d + r$ for some $r > 0$, then $g(d) = b^d$. Since $c' = c + \varepsilon$
for some $\varepsilon \sim 0$, $c' = d + r + \varepsilon$ and,

$$g(c) - g(d) = \mathrm{st}(b^{d+r+\varepsilon}) - b^d = b^d \cdot [\mathrm{st}(b^{r+\varepsilon}) - 1].$$

In this product the first factor is positive, and the second one is positive (or
negative) if $b > 1$ (or $b < 1$), and zero if $b = 1$. A similar result is obtained
if $r < 0$. Hence the function $g$ is monotonic if not constant, so that if $c$ is
between the rationals $d$ and $e$, then $g(c)$ is between $b^d$ and $b^e$. This shows both
the monotonicity and the continuity of $g(x)$, $x \in \mathbb{R}$, and hence that $g(c) = b^c$.

□

Exercise: Show that $\mathrm{st}(p \cdot q) = \mathrm{st}(p) \cdot \mathrm{st}(q)$ and that $\mathrm{st}(b^{\varepsilon}) = 1$ if $b > 0$ and $\varepsilon \sim 0$,
$\varepsilon \in {}^*\mathbb{Q}$.



3.6 Differentiation 



Let $f : {}^*\mathbb{R} \to {}^*\mathbb{R}$, $c \in {}^*\mathbb{R}$, then $f$ is said to be differentiable at $c$ if for some
$k \in {}^*\mathbb{R}$,

$$\lim_{x \to c} \frac{f(x) - f(c)}{x - c} \text{ exists and is equal to } k.$$

Then this limit, which obviously is a *limit, is called the derivative of $f$ at $c$, and
usually $k$ is replaced by $f'(c)$ or by $\frac{df(c)}{dx}$.

In case everything is standard, this definition becomes the classical definition of
differentiability, and from Section 3.3 it follows that $f : \mathbb{R} \to \mathbb{R}$ is differentiable
at $c \in \mathbb{R}$ if for some $k \in \mathbb{R}$,

$$\forall \delta \sim 0, \delta \neq 0 : \frac{{}^*f(c + \delta) - f(c)}{\delta} \sim k,$$

so that,

$$f'(c) = \mathrm{st}\left[\frac{{}^*f(c + \delta) - f(c)}{\delta}\right].$$

This means that $f'(c)$ is infinitesimally close to a quotient, justifying to a certain
extent calling $f'(c)$ a differential quotient, even though $f'(c)$ is a limit of a
quotient.

As before this need not be true if $f$ and $c$ are internal; then $f$ is called
S-differentiable at $c$ if for some internal $k$,

$$\forall \delta \sim 0, \delta \neq 0 : \frac{f(c + \delta) - f(c)}{\delta} \sim k = f'(c).$$

From the definition of $f'(c)$ in the standard case it follows that,

$$\forall \delta \sim 0 : {}^*f(c + \delta) - f(c) - f'(c) \cdot \delta = \tau\delta,$$

for some $\tau \sim 0$. This is known as the increment theorem. Conversely, if, for some
$k \in \mathbb{R}$,

$$\forall \delta \sim 0 : {}^*f(c + \delta) - f(c) - k\delta = \tau\delta, \text{ for some } \tau \sim 0,$$

then $f : \mathbb{R} \to \mathbb{R}$ is differentiable at $c \in \mathbb{R}$ and $f'(c) = k$.
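
A short worked instance may make the increment theorem concrete. The function $f(x) = x^2$ below is an illustrative choice and is not taken from the text.

```latex
% Illustrative example (assumed): f(x) = x^2, c real, \delta \sim 0.
\[
  {}^*f(c+\delta) - f(c) \;=\; (c+\delta)^{2} - c^{2} \;=\; 2c\,\delta + \delta^{2} ,
\]
% so with k = 2c the remainder is \tau\delta with \tau = \delta \sim 0:
\[
  {}^*f(c+\delta) - f(c) - 2c\,\delta \;=\; \delta \cdot \delta \;=\; \tau\delta ,
  \qquad \tau = \delta \sim 0 .
\]
% Equivalently, for \delta \neq 0,
\[
  f'(c) \;=\; \mathrm{st}\!\left[\frac{(c+\delta)^{2} - c^{2}}{\delta}\right]
        \;=\; \mathrm{st}(2c + \delta) \;=\; 2c .
\]
```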

Theorem 3.6.1 

1) If $f : \mathbb{R} \to \mathbb{R}$ is differentiable at $c \in \mathbb{R}$, then $f$ is continuous at $c$.

2) If $f$ is differentiable at $c$, and $b \in \mathbb{R}$, then $f + b$ and $b \cdot f$ are differentiable
at $c$ as well, and,

$$(f + b)'(c) = f'(c), \quad (b \cdot f)'(c) = b \cdot f'(c).$$

3) If both $f$ and $g$ are differentiable at $c$, then

$$(f + g)'(c) = f'(c) + g'(c), \quad (f \cdot g)'(c) = f'(c) \cdot g(c) + f(c) \cdot g'(c),$$

and, if $g(c) \neq 0$, then,

$$(f/g)'(c) = (f'(c) \cdot g(c) - f(c) \cdot g'(c))/g^2(c).$$

Note that in the last case $g(x) \neq 0$ if $x = c + \delta$, for $\delta \sim 0$, as $g$
is continuous at $c$. In particular, if $g(x) = 1/f(x)$, and $f(c) \neq 0$, then,

$$g'(c) = -f'(c)/f^2(c).$$

4) Chain rule. If $g$ is differentiable at $c$, and $f$ is differentiable at $g(c)$, then
the composite function $F = f \circ g$, too is differentiable at $c$, and,

$$F'(c) = (f \circ g)'(c) = f'(g(c)) \cdot g'(c).$$

Proof:

1) Follows immediately from ${}^*f(c + \delta) - f(c) - f'(c) \cdot \delta = \tau\delta$, so that,

$${}^*f(c + \delta) - f(c) \sim 0 \text{ if } \delta \sim 0.$$

2) and 3) are left as exercises.

4) For any $\delta \sim 0$, and for some $\mu \sim 0$,

$${}^*f(g(c) + \delta) - f(g(c)) - f'(g(c)) \cdot \delta = \mu\delta.$$

Let, given any $\varepsilon \sim 0$,

$$\delta = {}^*g(c + \varepsilon) - g(c) = g'(c) \cdot \varepsilon + \tau\varepsilon,$$

for some $\tau \sim 0$. Then $\delta \sim 0$ and,

$${}^*f({}^*g(c + \varepsilon)) - f(g(c)) - f'(g(c)) \cdot (g'(c) + \tau) \cdot \varepsilon = \mu\delta,$$

hence,

$${}^*f({}^*g(c + \varepsilon)) - f(g(c)) - f'(g(c)) \cdot g'(c) \cdot \varepsilon = f'(g(c)) \cdot \tau\varepsilon + \mu(g'(c) \cdot \varepsilon + \tau\varepsilon) = \tau'\varepsilon,$$

for some $\tau' \sim 0$, which means that $f \circ g$ is differentiable at $c$, and that its
derivative at $c$ is equal to $f'(g(c)) \cdot g'(c)$. □

Theorem 3.6.2 (The critical point theorem.) 

Let $X$ be an interval of $\mathbb{R}$, $c \in X$, $f : X \to \mathbb{R}$ be continuous at each $x \in X$, and
$c$ be a maximum or a minimum of $f$ over $X$. Then $c$ is an endpoint of $X$, or $f'(c)$
is undefined, or $f'(c) = 0$.

Proof: Let $c$ be a maximum of $f$ over $X$. If $c$ is not an endpoint of $X$ and if $f'(c)$
exists, then,

$$\forall \delta \sim 0, \delta \neq 0 : f'(c) = \mathrm{st}\left[\frac{{}^*f(c + \delta) - f(c)}{\delta}\right].$$

Taking first $\delta$ positive and then negative this gives that $f'(c) \le 0$ and $f'(c) \ge 0$,
hence $f'(c) = 0$. □

Theorem 3.6.3 (Rolle's theorem.) 

Let $f$ be continuous in $[a, b]$, $a, b \in \mathbb{R}$, $a < b$, and be differentiable in $(a, b)$.
Moreover, let $f(a) = f(b) = 0$. Then $f'(c) = 0$ for some $c \in (a, b)$.



Proof: The proof is classical in nature, and relies on Theorems 3.2.8 and 3.6.2. 

□ 

The proofs of the next corollaries too are classical in nature, and therefore are 
not given in detail. 

Corollary 3.6.1 (Mean value theorem.)

Let $f$ be as in the previous theorem, except that $f(a) = f(b) = 0$ need not be true.
Then,

$$f'(c) = \frac{f(b) - f(a)}{b - a} \text{ for some } c \in (a, b).$$

Proof: The proof follows by applying Rolle's theorem to $g$, defined by,

$$g(x) = f(x) - f(a) - \frac{f(b) - f(a)}{b - a} \cdot (x - a).$$

□



Corollary 3.6.2 (Generalized mean value theorem.) 

Let $f$ and $g$ both be continuous in $[a, b]$, and be differentiable in $(a, b)$. In addition
let $g'(x) \neq 0$ for all $x \in (a, b)$. Then,

$$\frac{f'(c)}{g'(c)} = \frac{f(b) - f(a)}{g(b) - g(a)} \text{ for some } c \in (a, b).$$

Proof: The proof follows by applying Rolle's theorem to $h$, defined by,

$$h(x) = f(x)(g(b) - g(a)) - g(x)(f(b) - f(a)).$$

□

Corollary 3.6.3 (Taylor's theorem.) 

Let $f$ and $f'$ both be continuous in $[a, b]$, and let $f''$ exist in $(a, b)$, then,

$$f(b) = f(a) + f'(a)(b - a)/1! + f''(c)(b - a)^2/2!, \text{ for some } c \in (a, b),$$

and similarly for higher derivatives.

Proof: The proof is based on the previous corollary. □

Theorem 3.6.4 (L'Hôpital's theorem.)

Let both $f$ and $g$ be differentiable, and $g'(x) \neq 0$ in a neighborhood $(a, b)$ of
$c \in \mathbb{R}$ with the possible exception of the point $c$ itself. Also assume that
$\lim_{x \to c} f(x) = \lim_{x \to c} g(x) = 0$, and that $\lim_{x \to c} \frac{f'(x)}{g'(x)}$ exists and is equal to $k$. Then,

$$\lim_{x \to c} \frac{f(x)}{g(x)} = k.$$

Similar statements hold true if $k$ and/or $c$ are replaced by $\infty$.

Proof: Only the case as stated will be proved.

Without loss of generality it may be assumed that $f(x)$ and $g(x)$ are defined at
$x = c$ and that $f(c) = g(c) = 0$, so that $f$ and $g$ are continuous at $c$. It follows
that the generalized mean value theorem can be applied to both $[c, c + \delta]$ and
$[c - \delta, c]$, for $\delta > 0$ and close enough to zero. Taking $\delta \sim 0$ this gives that, for
some $c_1$ and $c_2$, $c < c_1 < c + \delta$, $c - \delta < c_2 < c$,

$$\frac{f'(c_1)}{g'(c_1)} = \frac{f(c + \delta) - f(c)}{g(c + \delta) - g(c)} \text{ and } \frac{f'(c_2)}{g'(c_2)} = \frac{f(c) - f(c - \delta)}{g(c) - g(c - \delta)}.$$

Since both $c_1$ and $c_2$ are infinitesimally close to $c$ this implies that these two
expressions are infinitesimally close to $k$ and the theorem follows. □
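
The statement of the theorem can be checked on a simple quotient for which the limit is also available by direct computation. The functions $f(x) = x^2 - 1$ and $g(x) = x - 1$ with $c = 1$ are illustrative choices, not taken from the text.

```latex
% Illustrative example (assumed): f(x) = x^2 - 1, g(x) = x - 1, c = 1,
% so that f(1) = g(1) = 0 and g'(x) = 1 \neq 0.
% Quotient of the derivatives:
\[
  \frac{f'(x)}{g'(x)} \;=\; \frac{2x}{1} \;\longrightarrow\; 2 \;=\; k
  \qquad (x \to 1) .
\]
% Direct nonstandard check of the conclusion, for \delta \sim 0, \delta \neq 0:
\[
  \frac{f(1+\delta)}{g(1+\delta)}
  \;=\; \frac{(1+\delta)^{2} - 1}{\delta}
  \;=\; \frac{2\delta + \delta^{2}}{\delta}
  \;=\; 2 + \delta \;\sim\; 2 ,
\]
% in agreement with the limit k = 2 predicted by the theorem.
```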



3.7 Integration 



Only Riemann integration of continuous functions will be considered, even though 
the extension to more general functions is not very difficult. 

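Let $a, b \in {}^*\mathbb{R}$, $a < b$ and let $f : [a, b] \to {}^*\mathbb{R}$ be a continuous function. Then, by
definition, the Riemann integral of $f$ over $[a, b]$ is,

$$J = \int_a^b f(x)\,dx = \lim_{n \to \infty} \sum_{i=1}^{n} f(a + i(b - a)/n)(b - a)/n.$$

In case everything is standard, then, given any $\omega \in {}^*\mathbb{N}$, $\omega \sim \infty$,

$$J = \mathrm{st}\left[\sum_{i=1}^{\omega} {}^*f(a + i(b - a)/\omega)(b - a)/\omega\right].$$

Note that in this case $dx = (b - a)/\omega \sim 0$ and $dx > 0$. And the usual formulation
is,

$$J = \mathrm{st}\left[\sum_{i=1}^{\omega} {}^*f(a + i \cdot dx)\,dx\right].$$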

Instead of the point $a + i \cdot dx$ any other point $x_i \in [a + (i - 1)dx, a + i \cdot dx]$ may
be selected, $i = 1, \ldots, \omega$ and $J$ becomes $\mathrm{st}(S)$, where,

$$S = \sum_{i=1}^{\omega} {}^*f(x_i)\,dx,$$

a so-called Riemann sum. That $S$ is finite follows from the extreme value theorem
(Theorem 3.2.8), as this theorem implies the existence of $m, M \in \mathbb{R}$ such that,

$$\forall x \in [a, b] : m \le f(x) \le M,$$

so that $m(b - a) \le S \le M(b - a)$ and $S$ is indeed finite.
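
As a hedged illustration of the hyperfinite Riemann sum, the integral of the identity function over $[0, 1]$ can be evaluated in closed form. The integrand and interval are illustrative choices, not taken from the text; the formula for the sum of the first $\omega$ naturals is used for hyperlarge $\omega$ by transfer.

```latex
% Illustrative example (assumed): f(x) = x on [a,b] = [0,1], so dx = 1/\omega.
\[
  S \;=\; \sum_{i=1}^{\omega} (i \cdot dx)\,dx
    \;=\; dx^{2} \sum_{i=1}^{\omega} i
    \;=\; \frac{1}{\omega^{2}} \cdot \frac{\omega(\omega+1)}{2}
    \;=\; \frac{1}{2} + \frac{1}{2\omega} .
\]
% Since \omega is hyperlarge, 1/(2\omega) is an infinitesimal, hence
\[
  J \;=\; \int_{0}^{1} x\,dx \;=\; \mathrm{st}(S) \;=\; \frac{1}{2} ,
\]
% independently of which hyperlarge \omega was chosen.
```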

The notation of the integral correctly suggests that $J$ depends neither on $\omega$, nor
on how the $x_i$ are selected. Indeed, let $y_i$ be some other element of the interval
$[a + (i - 1)dx, a + i \cdot dx]$, then if $\omega$ is not changed it follows from the uniform
continuity of $f$ and $dx < \delta$ for all $\delta \in \mathbb{R}$, $\delta > 0$, that,

$$\forall \varepsilon \in \mathbb{R}, \varepsilon > 0 : \left|\sum_{i=1}^{\omega} [{}^*f(x_i) - {}^*f(y_i)]\,dx\right| < \omega \cdot \varepsilon \cdot dx = \varepsilon(b - a),$$

hence that the expression to the left of the last inequality is an infinitesimal, which
shows that $J$ does not depend on the selection of the $x_i$. And if $\omega_1, \omega_2 \sim \infty$, and
for $k = 1, 2$, $dx_k = (b - a)/\omega_k$ and,

$$S_k = \sum_{i=1}^{\omega_k} f(a + i \cdot dx_k)\,dx_k,$$

let $\omega = \omega_1\omega_2$ and let $dx$ and $S$ be as before, then,

$$S - S_1 = \sum_{i=1}^{\omega_1} \sum_{j=1}^{\omega_2} [f(a + \{(i - 1)\omega_2 + j\}dx)\,dx - f(a + i\omega_2 dx)\,dx],$$

and again this is an infinitesimal, as is $S - S_2$, and hence so is $S_1 - S_2$.

In case $a > b$, then by definition $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$, and if $a = b$, then
$\int_a^b f(x)\,dx = 0$.

As remarked before $f$ need not be continuous in order for $J$ to exist. Also in
what follows derivatives need not always be continuous, but let us not go into 
the details as this chapter is only concerned with showing the possibilities of 
nonstandard analysis. 



The definition of improper integrals is as in classical analysis: just consider the 
appropriate limits. 

Exercise: Present the details of introducing improper integrals, complete with the
simplified forms in case everything is standard. 

In the remainder of this section it is assumed that everything is standard.

Theorem 3.7.1 If $a < b < c$, then,

$$\int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx.$$



Proof: Follows from the fact that the standard part of a sum is equal to the sum 
of the standard parts. □ 

Theorem 3.7.2 $F$ defined by $F(x) = \int_a^x f(t)\,dt$, $a \le x \le b$, $a < b$, is continuous.
Moreover, the derivative of $F(x)$ exists and is equal to $f(x)$, $a \le x \le b$.
Conversely, if $G$ is such that $G'(x) = f(x)$, then,

$$F(b) = \int_a^b f(x)\,dx = G(b) - G(a).$$

Proof: The continuity of $F$ follows from the fact that, if $\delta \sim 0$,

$$\int_a^{x+\delta} f(t)\,dt - \int_a^x f(t)\,dt = \int_x^{x+\delta} f(t)\,dt,$$

no matter whether $\delta$ is nonnegative or negative, in which case the right-hand side
too is an infinitesimal.

The second part of the theorem follows from the fact that $m$ and $M$ exist as
before such that,

$$m \cdot dx \le F(x + dx) - F(x) = \int_x^{x+dx} f(t)\,dt \le M \cdot dx.$$

If $dx \sim 0$, $dx > 0$, $m$ and $M$ can be taken infinitesimally close to $f(x)$, so that for some
$\delta \sim 0$,

$$F(x + dx) - F(x) = f(x)\,dx + \delta \cdot dx,$$

which gives the desired result.

To show the last part of the theorem, note that for some constant $c$, $F(x) =
G(x) + c$, $a \le x \le b$, and that $F(a) = 0$. □

The proofs of the next two theorems are not particularly of a nonstandard nature, 
and for that reason are kept rather short. 

Theorem 3.7.3 (Substitution rule.)

Let $F'(x) = f(x)$ for $x \in [a, b]$, and let $x = g(w)$ for $w \in [\alpha, \beta]$ such that $g$ maps
$[\alpha, \beta]$ onto $[a, b]$, with $g(\alpha) = a$, $g(\beta) = b$. Also assume that $g$ is continuously
differentiable. Then,

$$\int_a^b f(x)\,dx = \int_\alpha^\beta f(g(w))g'(w)\,dw.$$

Proof: From Theorem 3.6.1 it follows that,

$$(F(g(w)))' = F'(g(w))g'(w) = f(g(w))g'(w),$$

hence,

$$\int_a^b f(x)\,dx = F(b) - F(a) = F(g(\beta)) - F(g(\alpha)) = \int_\alpha^\beta (F(g(w)))'\,dw = \int_\alpha^\beta f(g(w))g'(w)\,dw.$$



Theorem 3.7.4 (Integration by parts.) 

If both $f$ and $g$ are continuously differentiable in $[a, b]$, then,

$$\int_a^b f(x)g'(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b f'(x)g(x)\,dx.$$

Proof: Since $(f(x)g(x))' = f(x)g'(x) + f'(x)g(x)$ it follows that,

$$J = \int_a^b (f(x)g(x))'\,dx,$$

exists and is equal to the sum of the two integrals in the statement of the theorem.
But $J = f(b)g(b) - f(a)g(a)$. □



3.8 Pitfalls in nonstandard analysis 



In this section it is shown by means of a number of examples that some care is 
required when applying nonstandard analysis. 



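The existence of a positive infinitesimal $\varepsilon$ means that,

$$\exists \varepsilon \in {}^*\mathbb{R} : \forall m \in \mathbb{N} : 0 < \varepsilon < 1/m.$$

By transfer (?) this would give,

$$? \ \exists \varepsilon \in \mathbb{R} : \forall m \in \mathbb{N} : 0 < \varepsilon < 1/m \ ?,$$

which obviously is not true. The cause of the trouble is that in the first
statement the constant $\mathbb{N}$ is external, so that transfer is not allowed.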

Any nonempty subset $X$ of $\mathbb{N}$ has a smallest element, that is,

$$\forall X \subset \mathbb{N}, X \neq \emptyset : \exists x \in X : \forall y \in X : x \le y,$$

which by transfer (?) would lead to,

$$? \ \forall X \subset {}^*\mathbb{N}, X \neq \emptyset : \exists x \in X : \forall y \in X : x \le y \ ?,$$

which is wrong as can be seen by taking, for example, $X = {}^*\mathbb{N} - \mathbb{N}$
(which is external), for if $x$ were the smallest element of $X$, then also
$x - 1 \in X$, so $x - 1 < x$. The correct procedure would be:

$$\forall X \in \mathcal{P}(\mathbb{N}), X \neq \emptyset : \exists x \in X : \forall y \in X : x \le y,$$

which by transfer gives,

$$\forall X \in {}^*(\mathcal{P}(\mathbb{N})), X \neq \emptyset : \exists x \in X : \forall y \in X : x \le y,$$

so that $X$ must be internal. The latter is indeed true as could be shown
by returning to first principles, i.e. to write ${}^*(\mathcal{P}(\mathbb{N}))$ as $\{H(X_1, X_2, \ldots) :
X_i \subset \mathbb{N}\}$, so that, if $x_i$ is the smallest element of $X_i$, $H(x_1, x_2, \ldots)$ is the
smallest element of $H(X_1, X_2, \ldots)$.

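As is well-known $\mathbb{R}$ has Archimedean order, that is, given any $x \in \mathbb{R}$ there
is an $n \in \mathbb{N}$ such that $n > x$, or,

$$\forall x \in \mathbb{R} : \exists n \in \mathbb{N} : n > x.$$

But ${}^*\mathbb{R}$ has no such order,

$$? \ \forall x \in {}^*\mathbb{R} : \exists n \in \mathbb{N} : n > x \ ?,$$

for take any $x \sim \infty$. Indeed, transfer would give,

$$\forall x \in {}^*\mathbb{R} : \exists n \in {}^*\mathbb{N} : n > x,$$

which is correct. One might say that ${}^*\mathbb{R}$ has hyper-Archimedean order.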

In Section 1.4 it was shown by means of transfer that statements (1.2)
and (1.3) are equivalent:

$$\forall \varepsilon \in {}^*\mathbb{R}, \varepsilon > 0 : \exists \delta \in {}^*\mathbb{R}, \delta > 0 : \forall x \in {}^*\mathbb{R}, |x - c| < \delta : |{}^*f(x) - {}^*f(c)| < \varepsilon$$

and,

$$\forall \delta \in {}^*\mathbb{R}, \delta \sim 0 : {}^*f(c + \delta) - {}^*f(c) \sim 0, \text{ where } c \in \mathbb{R} \text{ and } f : \mathbb{R} \to \mathbb{R}.$$

Note that ${}^*c = c$. Replacing $c$ by a nonstandard constant or replacing ${}^*f$ by
a nonstandard, but internal, function, this equivalence may be destroyed.
Examples were already given in Section 3.3.

Let $S$ be a bounded subset of $\mathbb{R}$, then $S$ has a least upper bound in $\mathbb{R}$.
Hence, by transfer (?), the set of all infinitesimals in ${}^*\mathbb{R}$ would have a least
upper bound $\beta$ in ${}^*\mathbb{R}$, which it has not, because $\beta$ and hence $2\beta$ would
have to be infinitesimals themselves, but $2\beta > \beta$. Transfer is illegal here
because, by Theorem 2.10.4, the infinitesimals form an external set.



Chapter 4



Some special topics 



4.1 Principles of permanence 

In the proof of Theorem 3.3.2 the fact was used that an external set is not internal,
and in the remark below that proof it was noted that this fact, which is called Cauchy's
principle, is a principle of permanence. In general the latter is the statement that
if a certain property $P$ holds for all elements of a certain set $A$, it must hold for
at least one element not in $A$:

$$[\forall a \in A : P(a)] \Rightarrow [\exists b \notin A : P(b)].$$

The statement is based on the fact that an incompatibility between the property
and the set would exist in case the property would only hold for the elements of
the set. For example, let the set be $\mathbb{N}$ and let the property be being an element
of some given internal subset $S$ of ${}^*\mathbb{N}$:

$$[\forall a \in \mathbb{N} : a \in S] \Rightarrow [\exists b \notin \mathbb{N} : b \in S].$$

If $S$ would only contain the elements of $\mathbb{N}$ there would be an incompatibility as
$S$ is internal and $\mathbb{N}$ is external. In other words $S$ must contain some $\omega \sim \infty$.
The aspect of permanence here lies in the fact that belonging to S is necessarily 
carried over from the classical natural numbers to certain hyperlarge numbers. 
From this point of view Cauchy's principle would not seem to be a very explicit 
example of a principle of permanence, and indeed some authors restrict the term 
'principle of permanence' to cases where if something is true for all elements of 
some set, it must be true for some element outside that set. On the other hand 
all principles of permanence different from Cauchy's principle can be based on 
the latter (even Fehrele's principle to be discussed below). 

It would be wrong to conclude that in the example $S$ would contain all $\omega \sim \infty$,
since taking $\omega_0 \sim \infty$ arbitrarily, $S$ defined by,

$$S = \{n : n \in {}^*\mathbb{N}, n \le \omega_0\},$$

is internal. The latter is easily shown by returning to the basic theory, because,
letting $\omega_0 = H(n_{0i})$,

$$S = \{H(n_i) : n_i \le n_{0i}\} = \{H(n_i) : n_i \in S_i\}, \text{ where } S_i = \{n : 1 \le n \le n_{0i}\}.$$

Note that the example is a special case of overflow (see Theorem 2.11.1), and 
that overflow, as well as underflow, may be seen as principles of permanence. 

Theorem 4.1.1 If $f : X \to {}^*\mathbb{R}$ is an internal function such that
$\forall x \in X : f(x) \sim 0$, then $\sup_{x \in X} |f(x)| \sim 0$.

Proof 1: Since $f(x)$ is bounded over $X$, the supremum exists. Denote it by $\beta$.
Then $\forall \delta > 0 : \exists x \in X : |f(x)| > \beta - \delta$, that is
$\beta < \delta + |f(x)|$. Let $\delta \sim 0$, then it follows that $\beta \sim 0$. □

Proof 2: Let $I = \{\beta \in {}^*\mathbb{R} : [\forall x \in X : |f(x)| < \beta]\}$.
Then $I$ is internal, but it contains the external subset
$\{\beta \in {}^*\mathbb{R} : \beta > 0,\ \beta \text{ is not an infinitesimal}\}$, hence,
by Cauchy's principle, it must contain some $\beta \sim 0$, from which the result
follows. □

Note that the first proof, which is classical in nature, is to be preferred. Never- 
theless, the second proof is a good illustration of applying Cauchy's principle. 

Corollary 4.1.1 Let $[a, b]$ be some interval of $^*\mathbb{R}$, such that $b - a$ is
limited, and let $f$ and $g$ both be Riemann integrable functions over $[a, b]$ such
that $f(x) \sim g(x)$ for all $x \in [a, b]$. Then,

\[ \int_a^b f(x)\,dx \sim \int_a^b g(x)\,dx. \]

Proof. Let $\beta = \sup_{a \le x \le b} |f(x) - g(x)|$. According to the theorem
$\beta \sim 0$, hence,

\[ \left| \int_a^b f(x)\,dx - \int_a^b g(x)\,dx \right| \le \int_a^b |f(x) - g(x)|\,dx \le \beta \cdot \int_a^b dx = \beta(b - a), \]

which is an infinitesimal. □

In Cauchy's principle an external set is 'confronted' with an internal set, but there
is another principle of permanence (in the more general sense of the term) where
two external sets that are of different kinds are 'confronted' with each other. A
typical example of an external set of the first kind is the set of all infinitesimals
in $^*\mathbb{R}$, and a typical example of an external set of the second kind is the
set of all limited numbers in $^*\mathbb{R}$. Obviously, one must find a general rule
from which it follows that these two sets are indeed of a different kind. An external
set is of the first kind if it is a halo, and it is of the second kind if it is a
galaxy. The notions 'halo' and 'galaxy' are defined below, and the second principle
of permanence is Fehrele's principle:

No halo is a galaxy, hence no galaxy is a halo. 

A set $H$ is called a halo if there exists an $\omega \in {}^*\mathbb{N}$,
$\omega \sim \infty$, such that,

1) $H = \bigcap_{n \in \mathbb{N}} S(n)$, where $(S(n), n \in [1, \omega])$ is a
hyperfinite internal sequence of internal subsets $S(n)$ of some given standard
set (such as $^*\mathbb{N}$ or $^*\mathbb{R}$ or something totally different), and,

2) $H$ is external.

The second requirement is not superfluous, for take all S(n) equal to some fixed 
internal set. 

Obviously, the sequence involved may be an infinite sequence, because for any
$\omega \sim \infty$ it contains a hyperfinite sequence with $[1, \omega]$ as its
domain. It is also allowed to define $S(n)$ for $n \in \mathbb{N}$ only, even though
the sequence would then be external. For, with the operator $H$ as in the basic
theory, let $S(n) = H(S_i(n))$, $n \in \mathbb{N}$, and define $T(n)$ for
$n \in {}^*\mathbb{N}$ by $T(H(m_i)) = H(S_i(m_i))$, so that $T = H(S_i)$ and $T$ is
internal. Moreover, $T(n) = H(S_i(n)) = S(n)$, $n \in \mathbb{N}$, hence $T$ extends
$S$ as a function with domain $\mathbb{N}$ to a function with domain $^*\mathbb{N}$.

It is no restriction to assume that the sequence $(S(n), n \in [1, \omega])$ is
nonincreasing, i.e. that $S(1) \supseteq S(2) \supseteq \ldots \supseteq S(\omega)$,
for if this is not the case, let,

\[ S'(n) = \bigcap_{k \le n} S(k), \quad n \in [1, \omega], \]

then,

$(S'(n), n \in [1, \omega])$ is nonincreasing, and $H = \bigcap_{n \in \mathbb{N}} S'(n)$.

It may even be assumed that the sequence is strictly decreasing, i.e. that
$S(1) \supsetneq S(2) \supsetneq \ldots \supsetneq S(\omega)$ (perhaps for some other
$\omega$). For let,

\[ K = \{n \in [1, \omega] : [\exists p(n) > n : S(n) \supsetneq S(p(n))]\}, \]

then $\mathbb{N} \subset K$, because otherwise
$\exists n' \in \mathbb{N} : \forall p > n' : S(n') = S(p)$, hence $H = S(n')$, but $H$
is external and $S(n')$ is internal. Moreover, by the internal definition principle,
$K$ is internal, hence $\exists \omega' \sim \infty : \omega' \in K$, i.e.
$[1, \omega'] \subset K$. Now let $m_1 = 1$, $m_2 = p(m_1)$, ..., and
$S''(n) = S(m_n)$, $n \in [1, \omega']$, then,

$(S''(n), n \in [1, \omega'])$ is strictly decreasing, and $H = \bigcap_{n \in \mathbb{N}} S''(n)$.

Conversely, if $H = \bigcap_{n \in \mathbb{N}} S(n)$ and $(S(n), n \in [1, \omega])$,
where $\omega \sim \infty$, is strictly decreasing, then $H$ is automatically external,
for let $M = \{n \in {}^*\mathbb{N} : H \subset S(n)\}$. If $H$ were internal, then, by
the internal definition principle, $M$ would be internal as well, but $M = \mathbb{N}$
and $\mathbb{N}$ is external. To see that $M = \mathbb{N}$ observe that
$\mathbb{N} \subset M$ and suppose that $H \subset S(\omega)$ for some
$\omega \sim \infty$, then $H \subset S(\omega) \subset S(n)$ for all
$n \in \mathbb{N}$, so that
$H \subset S(\omega) \subset \bigcap_{n \in \mathbb{N}} S(n) = H$, a contradiction.
This proves the next theorem.

Theorem 4.1.2 $H$ is a halo if and only if $H = \bigcap_{n \in \mathbb{N}} S(n)$, where
$(S(n), n \in [1, \omega])$ for some $\omega \sim \infty$ is a strictly decreasing
internal sequence of internal sets $S(n)$. □



A set $G$ is called a galaxy if there exists an $\omega \in {}^*\mathbb{N}$,
$\omega \sim \infty$, such that,

1) $G = \bigcup_{n \in \mathbb{N}} T(n)$, where $(T(n), n \in [1, \omega])$ is a
hyperfinite internal sequence of internal subsets $T(n)$ of some given standard
set, and,

2) $G$ is external.

As before the sequence may be an infinite internal sequence or be defined for
$n \in \mathbb{N}$ only.

Also the sequence may be assumed to be nondecreasing and even strictly increasing,
as can be seen by an argument similar to the one leading to the preceding theorem.

Theorem 4.1.3 $G$ is a galaxy if and only if $G = \bigcup_{n \in \mathbb{N}} T(n)$, where
$(T(n), n \in [1, \omega])$ for some $\omega \sim \infty$ is a strictly increasing
internal sequence of internal sets $T(n)$. □



Remarks: 

1. The given standard set is arbitrary, and hence may be some abstract set. 
This shows the generality of the two definitions, where numbers only play 
a part in the definitions of the two notions, and even this can be weakened, 
as can be seen from the next remark. 

2. The definitions can be generalized by replacing $^*\mathbb{N}$ by some standard
index set $^*A$, and letting $S$ and $T$ be internal functions mapping the
$a \in {}^*A$ to internal sets $S(a)$ and $T(a)$. Then
$H = \bigcap_{a \in A} S(a)$ and $G = \bigcup_{a \in A} T(a)$.

Theorem 4.1.4 (Fehrele's principle.) No halo is a galaxy, hence no galaxy is a
halo.

Proof: (Van den Berg.) Assume to the contrary that some halo $H$ is equal to
some galaxy $G$. Let $H = \bigcap_{n \in \mathbb{N}} S(n)$ and
$G = \bigcup_{n \in \mathbb{N}} T(n)$, with $S(n)$ and $T(n)$ internal, $(S(n))$
nonincreasing and $(T(n))$ nondecreasing. Then $T(n) \subset S(n)$ for all
$n \in \mathbb{N}$. Let $I = \{n \in {}^*\mathbb{N} : T(n) \subset S(n)\}$. Then $I$ is
internal, as follows from the internal definition principle. Also
$\mathbb{N} \subset I$, so that, since $\mathbb{N}$ is external, $\omega \in I$ for some
$\omega \sim \infty$, so that
$S(n) \supset S(\omega) \supset T(\omega) \supset T(n)$ for all $n \in \mathbb{N}$, hence
$H \supset S(\omega) \supset T(\omega) \supset G$, or $H = S(\omega) = T(\omega) = G$,
so that $H = G$ would be internal. □

Exercise: Show that the subset $G$ of an internal set $S$ is a galaxy if and only if
$S - G$ is a halo.

Corollary 4.1.2 (Robinson's lemma.)

Let $(s(n), n \in {}^*\mathbb{N})$, $s(n) \in {}^*\mathbb{R}$, be an internal sequence,
such that $s(n) \sim 0$ for all $n \in \mathbb{N}$. Then,

\[ \exists \omega \in {}^*\mathbb{N},\ \omega \sim \infty : \forall k \in {}^*\mathbb{N},\ k \le \omega : s(k) \sim 0. \]

Proof 1: (Van den Berg.) Let
$H = \{n \in {}^*\mathbb{N} : [\forall k \le n : s(k) \sim 0]\}$ and $G = \mathbb{N}$.
Then $G$ is a galaxy and $H \supset \mathbb{N} = G$. If $H$ is external, then
$G \subsetneq H$ (by Fehrele's principle), and if $H$ is internal then trivially
$G \subsetneq H$, hence $G \subsetneq H$ anyway. Let $\omega \in H - G$, then
$\omega \sim \infty$, and $s(k) \sim 0$ for all $k \le \omega$. □

Proof 2: (Robinson; in time preceding the first proof, and using Cauchy's
principle.) Let $S = \{n \in {}^*\mathbb{N} : [\forall k \le n : |s(k)| < 1/k]\}$,
then $S$ is internal and $S \supset \mathbb{N}$, hence (by Cauchy's principle)
$S \supsetneq \mathbb{N}$, so that,

\[ \exists \omega \in {}^*\mathbb{N},\ \omega \sim \infty : \forall k \le \omega : |s(k)| < 1/k. \]

Let $k \le \omega$. If $k \sim \infty$, then $s(k) \sim 0$, as then $1/k \sim 0$, and if
$k \in \mathbb{N}$, then by assumption $s(k) \sim 0$. □
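
A small worked illustration of Robinson's lemma may help. The sequence below is an assumption chosen here for illustration and does not come from the text: it is infinitesimal at every classical index, stays infinitesimal up to a suitable hyperlarge index, and yet is not infinitesimal at every hyperlarge index.

```latex
% Hypothetical example: fix a hyperlarge \omega \in {}^*\mathbb{N} and put
\[
  s(n) = \frac{n}{\omega}, \qquad n \in {}^*\mathbb{N}.
\]
% For classical n the term is infinitesimal:
\[
  n \in \mathbb{N} \;\Longrightarrow\; s(n) = \frac{n}{\omega} \sim 0 .
\]
% Let \nu be the integer part of \sqrt{\omega}; then \nu \sim \infty and
\[
  k \le \nu \;\Longrightarrow\; 0 < s(k) \le \frac{\sqrt{\omega}}{\omega}
  = \frac{1}{\sqrt{\omega}} \sim 0 ,
\]
% whereas the property cannot persist for every hyperlarge index:
\[
  s(\omega) = 1 \not\sim 0 .
\]
```

So the hyperlarge bound promised by the lemma depends on the sequence; here $\nu$ works, but $\omega$ itself does not.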

Corollary 4.1.3 (Dominated approximation.)

Let $f$, $g$ and $h$ be functions from $^*\mathbb{R}$ to $^*\mathbb{R}$, Riemann integrable
over $(-\infty, +\infty)$. Let $f$ and $g$ be internal but $h$ be standard. Assume that
$f(x) \sim g(x)$ for all limited $x$, and that $|f(x)|, |g(x)| \le h(x)$ for all
$x \in {}^*\mathbb{R}$. Then,

\[ \int_{-\infty}^{+\infty} f(x)\,dx \sim \int_{-\infty}^{+\infty} g(x)\,dx. \]

Proof: Given any $n \in \mathbb{N}$, let $\beta = \sup_{|x| \le n} |f(x) - g(x)|$, then
$\beta \sim 0$, as follows from Theorem 4.1.1, and Corollary 4.1.1 implies that,

\[ \int_{-n}^{+n} f(x)\,dx \sim \int_{-n}^{+n} g(x)\,dx, \]

so that, by Robinson's lemma, there exists an $\omega \sim \infty$ such that,

\[ \int_{-\omega}^{+\omega} f(x)\,dx \sim \int_{-\omega}^{+\omega} g(x)\,dx. \]

Since, as $h$ is standard, $\int_{|x| > \omega} h(x)\,dx \sim 0$, it follows that,

\[ \int_{|x| > \omega} f(x)\,dx \sim 0 \quad \text{and} \quad \int_{-\omega}^{+\omega} f(x)\,dx \sim \int_{-\infty}^{+\infty} f(x)\,dx, \]

and similarly for $g$ instead of $f$, from which the desired result follows. □

Exercise: Show that if $(s(n), n \in {}^*\mathbb{N})$, $s(n) \in {}^*\mathbb{R}$, is an
internal sequence such that $\forall n \in \mathbb{N} : |s(n)| < 1/n$, then
$\exists \omega \sim \infty : \forall n \in [1, \omega] : |s(n)| < 1/n$.



4.2 The saturation principle 

The saturation principle is concerned with infinite sequences of internal sets and
does not hold in classical mathematics. An infinite sequence
$(S(n), n \in \mathbb{N})$ of sets, internal or not, has the finite intersection
property if $\bigcap_{k=1}^{n} S(k) \ne \emptyset$ for all $n \in \mathbb{N}$.

Theorem 4.2.1 (Saturation.)

Let the infinite sequence $(S(n), n \in \mathbb{N})$ of internal sets $S(n)$ have the
finite intersection property, then the intersection of all of them is nonempty, i.e.
$\bigcap_{n \in \mathbb{N}} S(n) \ne \emptyset$.

Proof 1: (Not using permanence.) Let $S(n) = H(S_i(n))$, where $H$ is the
$H$-operator of the basic theory. For all $n \in \mathbb{N}$, let,

\[ T(n) = \bigcap_{k=1}^{n} S(k), \quad \text{and} \quad T_i(n) = \bigcap_{k=1}^{n} S_i(k), \]

then for $n \ge 2$, $T(1) \supseteq T(2) \supseteq \ldots \supseteq T(n)$, hence,

\[ \{i : T_i(1) \supseteq T_i(2) \supseteq \ldots \supseteq T_i(n)\} \in U, \]

where $U$ is the basic free ultrafilter. Also $\{i : T_i(n) \ne \emptyset\} \in U$, and
since $\{i : i < n\}$ is a finite set,

\[ Q_n = \{i : i \ge n,\ T_i(1) \supseteq T_i(2) \supseteq \ldots \supseteq T_i(n),\ T_i(n) \ne \emptyset\} \in U. \]

Obviously, $Q_n \supseteq Q_{n+1}$. For $i \in Q_2$, $i \ge 2$, let $n_i$ be the maximal
$n \ge 2$ such that,

\[ T_i(1) \supseteq T_i(2) \supseteq \ldots \supseteq T_i(n), \quad T_i(n) \ne \emptyset, \quad \text{and } n \le i. \]

This $n_i$ is well-defined. Then $\{n_2, n_3, \ldots\}$ is not bounded, because if it
were, so that $n_i < m$ for some $m \in \mathbb{N}$, then $Q_{m+1} = \emptyset$, but
$Q_{m+1} \in U$. Now for each $i \in Q_2$, $i \ge 2$, take $s_i \in T_i(n_i)$ and take
$s_i$ arbitrarily otherwise, then for each $n \in \mathbb{N}$, $H(s_i) \in S(n)$,
because for each $n \in \mathbb{N}$ there is an $n_i \ge n$, so that as
$s_i \in T_i(n_i)$, also $s_i \in T_i(n) \subset S_i(n)$, and $H(s_i) \in S(n)$, as
$Q_2 \in U$. □

Proof 2: (Using Cauchy's principle, and more elegant.) First extend the given
sequence to an internal sequence as indicated below the definition of halo (this is
not necessary in the first proof). Now let
$Q = \{n \in {}^*\mathbb{N} : \bigcap_{k=1}^{n} S(k) \ne \emptyset\}$, then $Q$ is
internal and $Q \supset \mathbb{N}$, hence $Q \supsetneq \mathbb{N}$ and,

\[ \exists \omega \sim \infty : \bigcap_{k=1}^{\omega} S(k) \ne \emptyset, \quad \text{so that certainly } \bigcap_{k \in \mathbb{N}} S(k) \ne \emptyset. \]

□

Note that the second proof leads to a more general result, which in fact is a 
principle of permanence. 

In classical mathematics a counterexample to the theorem is, for example, the
sequence $(S(n))$ where $S(n) = \{n, n + 1, \ldots\}$, $n \in \mathbb{N}$.
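
The classical counterexample can be checked mechanically on a finite window. The sketch below is only an illustration under an assumption: the infinite tails are truncated at a hypothetical cut-off `HORIZON`, which is harmless for the two facts being checked (each finite intersection contains n, and each m is missing from S(m + 1)).

```python
def S(n: int, horizon: int) -> set:
    """S(n) = {n, n+1, ...}, truncated to the numbers below `horizon` (hypothetical cut-off)."""
    return set(range(n, horizon))


def finite_intersection(n: int, horizon: int) -> set:
    """S(1) & S(2) & ... & S(n), truncated at `horizon`."""
    result = S(1, horizon)
    for k in range(2, n + 1):
        result &= S(k, horizon)
    return result


HORIZON = 50

# Finite intersection property: S(1) & ... & S(n) equals S(n) and contains n.
finite_ok = all(n in finite_intersection(n, HORIZON) for n in range(1, HORIZON))

# Yet no classical m can belong to every S(n): m is already missing from S(m + 1).
excluded = all(m not in S(m + 1, HORIZON) for m in range(1, HORIZON))
```

Both flags come out true, which is exactly why the intersection over all classical n is empty although every finite intersection is not.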

Corollary 4.2.1 Let $A$ be a given internal set, and $(S(n))$ an infinite sequence
of internal subsets of $A$. If for all $n \in \mathbb{N}$,
$\bigcup_{k=1}^{n} S(k) \ne A$, then $\bigcup_{k \in \mathbb{N}} S(k) \ne A$.
Hence if the union of any finite number of $S(n)$ does not fill up $A$, then the union
of all of them does not fill up $A$.

Proof: The proof follows from the fact that $(\bigcup S(n))^c = \bigcap S^c(n)$, where
$c$ denotes complementation with respect to $A$. □

Corollary 4.2.2 Given an infinite sequence $(S(n))$ of internal sets, then
$S = \bigcup_{k \in \mathbb{N}} S(k)$ is internal if and only if there exists an
$n \in \mathbb{N}$ such that $S = \bigcup_{k=1}^{n} S(k)$.

Proof: The if-part follows immediately. Conversely, if $S$ is internal, then so are
all $T(n) = S - S(n)$, and $\bigcap_{k \in \mathbb{N}} T(k) = \emptyset$, hence there
must exist an $n \in \mathbb{N}$ such that $\bigcap_{k=1}^{n} T(k) = \emptyset$, which
means that $S = \bigcup_{k=1}^{n} S(k)$. □

Corollary 4.2.3 Given an infinite sequence $(S(n))$ of internal sets, then
$S = \bigcap_{k \in \mathbb{N}} S(k)$ is internal if and only if there exists an
$n \in \mathbb{N}$ such that $S = \bigcap_{k=1}^{n} S(k)$.

Proof: By complementation from Corollary 4.2.1. □



4.3 Stirling's formula 

In order to provide still more evidence that nonstandard mathematics can be a
very elegant substitute for classical mathematics, in this section Stirling's formula
for large factorials will be derived by nonstandard means. The argument closely
follows that given in Van den Berg and Sari [27]. It takes the definition and
properties of $e$ as the base of the natural logarithm for granted, as well as those
of $\pi$ as the area of the unit circle, and that,

\[ \int_{-\infty}^{+\infty} \exp(-x^2)\,dx = \sqrt{\pi}. \]

By definition,

\[ \Gamma(x) = \int_0^{\infty} e^{-t} t^{x-1}\,dt, \quad x \in {}^*\mathbb{R},\ x > 0. \]

Also the existence of this integral is taken for granted. Let $\omega$ be any positive
hyperlarge element of $^*\mathbb{R}$, so that,

\[ \Gamma(\omega + 1) = \int_0^{\infty} e^{-t} t^{\omega}\,dt. \]

The integrand is increasing in the interval $[0, \omega]$ and decreasing in the interval
$[\omega, \infty)$, for which reason the variable $t$ is replaced by
$u = (t - \omega)/\omega$, giving,

\[ \Gamma(\omega + 1) = \omega^{\omega+1} e^{-\omega} \int_{-1}^{\infty} e^{-\omega u + \omega \log(1+u)}\,du, \]

so that the integrand now reaches its maximum at $u = 0$. It so happens that
there exists a positive infinitesimal $\delta$ such that the contributions of the
integrand over the intervals $[-1, -\delta]$ and $[+\delta, \infty)$ may be ignored, so
that only the interval $[-\delta, +\delta]$ need be taken into account. In other words,
the 'mass' of the integrand is almost entirely concentrated in a hypersmall interval
around zero. Instead of the infinitesimal $\delta$ consider for the time being any
$d \in {}^*\mathbb{R}$, $0 < d < 1$, split $[-1, \infty)$ into $[-1, -d]$, $[-d, +d]$ and
$[+d, \infty)$ and indicate the integrals of $e^{-\omega u + \omega \log(1+u)}$
over these subintervals by,

\[ \int_{-1}^{-d}, \quad \int_{-d}^{+d}, \quad \int_{+d}^{\infty}, \quad \text{respectively.} \]
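
The concentration of the 'mass' around zero can be made plausible numerically. The sketch below is an illustration only, under stated assumptions: a large finite value `OMEGA` stands in for a hyperlarge number, a fixed width `D` stands in for the half-width of the middle interval, the upper limit is cut off at a hypothetical `UPPER`, and a plain trapezoidal rule replaces the exact integrals. The integrand is exp(-omega*u + omega*log(1+u)) on [-1, infinity).

```python
import math


def integrand(u: float, omega: float) -> float:
    """exp(-omega*u + omega*log(1+u)), taken as 0 at the endpoint u = -1."""
    if u <= -1.0:
        return 0.0
    return math.exp(-omega * u + omega * math.log1p(u))


def trapezoid(a: float, b: float, omega: float, steps: int = 100_000) -> float:
    """Composite trapezoidal rule for the integrand over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (integrand(a, omega) + integrand(b, omega))
    for k in range(1, steps):
        total += integrand(a + k * h, omega)
    return total * h


OMEGA = 400.0   # finite stand-in for a hyperlarge omega (assumption)
D = 0.25        # half-width of the middle interval (assumption)
UPPER = 6.0     # cut-off replacing +infinity (assumption)

left = trapezoid(-1.0, -D, OMEGA)
middle = trapezoid(-D, D, OMEGA)
right = trapezoid(D, UPPER, OMEGA)

total = left + middle + right
tail_share = (left + right) / total                       # fraction of mass in the tails
gauss_ratio = total / math.sqrt(2.0 * math.pi / OMEGA)    # compare with sqrt(2*pi/omega)
```

For these values the two tails carry only a tiny fraction of the total, and the total is close to sqrt(2*pi/omega). A finite computation of this kind cannot replace the argument with infinitesimals; it only shows the trend that the argument makes exact.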
1) $[+d, \infty)$. Since the second derivative of $u - \log(1 + u)$ is positive, it
follows from Taylor's theorem that,

\[ u - \log(1 + u) \ge d - \log(1 + d) + (u - d)d/(1 + d), \]

giving, after replacing in the integrand the left-hand side of this inequality
by its right-hand side, and evaluating the resulting integral, that,

\[ 0 < \int_{+d}^{\infty} \le e^{-\omega d + \omega \log(1+d)} (1 + d)/(\omega d), \]

if (in view of the denominator $\omega d$) $d$ is not an infinitesimal. Since
$-d + \log(1 + d) < 0$ it then further follows that,

\[ \forall m \in \mathbb{N} : 0 < \int_{+d}^{\infty} < \omega^{-m}. \]

Let $G = \{d \in {}^*\mathbb{R} : 0 < d < 1,\ d \text{ is not an infinitesimal}\}$, then
since the positive infinitesimals form a halo, $G$ is a galaxy. But,

\[ H = \bigcap_{m \in \mathbb{N}} \Big\{d \in {}^*\mathbb{R} : 0 < d < 1,\ 0 < \int_{+d}^{\infty} < \omega^{-m}\Big\}, \]

is a halo that clearly contains $G$, hence, by Fehrele's principle, $H$ must
contain a positive infinitesimal $\delta'$, such that,

\[ \forall m \in \mathbb{N} : 0 < \int_{+\delta'}^{\infty} < \omega^{-m}. \]

Obviously, $\delta'$ may be replaced by a larger infinitesimal.

2) $[-1, -d]$. Now,

\[ u - \log(1 + u) \ge -d - \log(1 - d) - (u + d)d/(1 - d), \]

and since $d + \log(1 - d) < 0$ it follows similarly that,

\[ \forall m \in \mathbb{N} : 0 < \int_{-1}^{-d} < \omega^{-m}, \]

if again $d$ is not an infinitesimal, but nevertheless there must be a positive
infinitesimal $\delta''$, that may be replaced by a larger one, such that,

\[ \forall m \in \mathbb{N} : 0 < \int_{-1}^{-\delta''} < \omega^{-m}. \]

The details are left as an exercise.
Letting $\delta = \max\{\delta', \delta''\}$ it follows that,

\[ \forall m \in \mathbb{N} : 0 < \int_{+\delta}^{\infty} + \int_{-1}^{-\delta} < 2\omega^{-m}, \]

showing that the contributions of the two 'tails' are extremely small, which
is not yet to say that they may be ignored.

3) $[-\delta, +\delta]$. By Taylor's theorem,
$u - \log(1 + u) = (u^2/2)/(1 + \theta u)^2$, for some $\theta$, $0 < \theta < 1$, so
that $u - \log(1 + u) = u^2(1 + \varepsilon'(u))/2$ for some $\varepsilon'(u) \sim 0$.
Replacing $u$ by $v = u\sqrt{\omega}$ then gives, for some $\varepsilon(v) \sim 0$,

\[ \int_{-\delta}^{+\delta} = \omega^{-1/2} \int_{-\delta\sqrt{\omega}}^{+\delta\sqrt{\omega}} \exp(-v^2(1 + \varepsilon(v))/2)\,dv. \]

Here $\delta$ is fixed such that $\delta\sqrt{\omega} \sim \infty$. Now let
$f(v) = \exp(-v^2(1 + \varepsilon(v))/2)$, $g(v) = \exp(-v^2/2)$, and
$h(v) = \exp(-v^2/4)$, then from Corollary 4.1.3 it follows that,

\[ \int_{-\delta}^{+\delta} \sim \omega^{-1/2} \cdot \int_{-\infty}^{+\infty} \exp(-v^2/2)\,dv = \sqrt{2\pi/\omega}. \]

Combining everything (in 2) it is sufficient to take $m = 1$) finally leads to,

\[ \lim_{x \to \infty} \frac{\Gamma(x + 1)}{x^x e^{-x} \sqrt{2\pi x}} = 1. \]

□

For more general results the reader may consult Van den Berg [28] and Koudjeti 
[29]. 
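
As a numerical sanity check of the limit just derived, the ratio of Gamma(x + 1) to x^x e^(-x) sqrt(2 pi x) can be evaluated for growing x. This is a minimal sketch and not part of the derivation; the function name `stirling_ratio` and the sample points are choices made here, and logarithms are used so that the large factors do not overflow.

```python
import math


def stirling_ratio(x: float) -> float:
    """Gamma(x+1) / (x**x * exp(-x) * sqrt(2*pi*x)), computed via logarithms."""
    log_gamma = math.lgamma(x + 1.0)
    log_stirling = x * math.log(x) - x + 0.5 * math.log(2.0 * math.pi * x)
    return math.exp(log_gamma - log_stirling)


# Sample points are arbitrary; the ratio should approach 1 as x grows.
ratios = {x: stirling_ratio(x) for x in (1.0, 10.0, 100.0, 1e4, 1e6)}
```

The computed ratios decrease towards 1, in line with the limit; how fast they do so is a matter for the more general results cited above.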



4.4 Nonstandard mathematics without the axiom of 
choice? 

The preceding pages should have made it clear that nonstandard mathematics
can be introduced in a way that is well-known to classical mathematicians. But
logicians claim that from the point of view of logic and axiomatics, when relating,
say, $\mathbb{R}$ to $^*\mathbb{R}$, our naive approach obscures the insight into what is
really happening. They are right, but nevertheless a naive approach could well be more
understandable and also more acceptable, because there is no verdict on external
sets, and because the axioms can easily be grasped (not even the Zermelo-Fraenkel
axioms of set theory are necessary, requiring to look at natural numbers as sets
and implying the unintended fact that there must be hyperlarge natural numbers).
Yet, one stumbling block remains: the axiom of choice. Couldn't we do
without it? Let us try and see what happens if the same general line of thinking is
followed but the underlying free ultrafilter $U$ over $\mathbb{N}$ is replaced by the
Fréchet filter $F^{\circ}$ (see Section 1.14). This means that again infinite sequences
of classical entities will generate entities that either are new or are identified
with their classical counterparts. It also means that we no longer follow ideas of
Luxemburg and others, but follow instead those of Chwistek (see Section 1.9) and perhaps
those of Cauchy, who does not seem to be very explicit, however, when it comes
to defining infinitesimals. Anyway, it would have been difficult for Cauchy to base
his informal treatment of the infinitesimals on a free ultrafilter $U$, because the
free ultrafilter theorem (see the Appendix) was not known to him, and moreover
the axiom of choice had in his time still to be 'invented'. Chwistek is much
more explicit, but does not develop anything that could be appreciated as a fully
fledged infinitesimal calculus.

Clearly, $Q \in F^{\circ}$ if and only if $i \in Q$ for $i \ge n$ for some
$n \in \mathbb{N}$. The latter will be rephrased as 'for $i$ large enough'.

As has been made clear in Section 1.10, in order to introduce $^*\mathbb{R}$ (now with
respect to $F^{\circ}$) it will again be necessary to consider all infinite sequences of
real numbers. Instead of,

$H(s_i) = H(t_i)$ if and only if $\{i : s_i = t_i\} \in U$ (Section 2.2),

the definition of equality will be,

$H(s_i) = H(t_i)$ if and only if $s_i = t_i$ for $i$ large enough,

and (Section 2.3),

$H(s_i) = s$ if and only if $s_i = s$ for $i$ large enough, $s_i, s \in \mathbb{R}$.

Theorem 2.3.1 remains true, although the only-if part of the proof must be
modified:

If $\{i : S_i = T_i\} \notin F^{\circ}$, then either there is a subsequence
$(s_{i(j)}, j \in \mathbb{N})$ such that $s_{i(j)} \in S_{i(j)}$, but
$s_{i(j)} \notin T_{i(j)}$, or there is a subsequence with the roles of $S$ and $T$
reversed (or both). Assume the first case, and take $s_i \in S_i$ arbitrarily if $i$ is
not an index of the subsequence. Then $H(s_i) \in S$ and $H(s_i) \notin T$, i.e.
$S \ne T$, and similarly for the other case. □

So, again,

\[ H(S_i) = \{H(s_i) : s_i \in S_i\}, \]

and Theorem 2.3.2 too remains valid.

Also the introduction of internal pairs, n-tuples in general and functions does not 
cause difficulties. 

But with Theorem 2.4.2 the problems begin. Let $S = \{0, 1\}$, then $^*S$ contains
$q = H(0, 1, 0, 1, \ldots)$. Even though $S$ is finite, $^*S \ne S$, because
$q \ne 0$ and $q \ne 1$. Moreover $q$ is not hyperlarge, so what is it as an element
of $^*\mathbb{N}$? Actually $^*S$ turns out to be infinite, and the conclusion is that
$^*S$ contains far too many elements, i.e. the Fréchet filter does not operate properly
and allows too much to go through. But perhaps this is not really harmful. The survey
at the end of Section 2.4 reveals more trouble, however, for although $^*\emptyset$,
$^*{=}$, $^*{\in}$ and $^*{\cap}$ are equal to or equivalent to $\emptyset$, $=$, $\in$
and $\cap$, respectively, this is not true for: $\ne$, $\notin$, $\cup$, $-$ and $^c$.
In fact, each of the relevant equivalences or equalities must be replaced by an
implication or an inclusion in the correct direction, as the reader can find out for
her or himself. It follows that $^*{\ne}$, $^*{\notin}$, $^*{\cup}$, $^*{-}$ and
$^{*c}$ are all new relations or operations. Consequently, it follows from Section 2.6
(ignoring the implications of Section 2.5 on externality) that Łoś' theorem
(Section 2.7) is no longer true, and that the same holds for transfer (Section 2.8).
As one of the many counterexamples, consider $H(S_i) \mathbin{{}^*{\cup}} H(T_i)$ that
is no longer equal to $H(S_i) \cup H(T_i)$.

Section 2.7 reveals even more trouble: although $^*{\wedge}$ is equivalent to
$\wedge$; $^*{\neg}$, $^*{\vee}$, $^*{\Rightarrow}$ and $^*{\Leftrightarrow}$ are not
equivalent to $\neg$, $\vee$, $\Rightarrow$ and $\Leftrightarrow$, respectively.

Finally, let us review the quantifiers. Although, given $X_i$ and $c_i$,

\[ \exists H(x_i) \in H(X_i) : H(x_i) = H(c_i) \]

is equivalent to,

\[ H(\exists x_i \in X_i : x_i = c_i), \]

the equivalence is in general invalid if the simple statement $x_i = c_i$ is replaced
by some other statement. Similar remarks apply to $\forall$.

Also the definition of $^*R$ with $R$ some binary relation is cumbersome, for suppose
that $s_{2i} R\, t_{2i}$ but $\neg(s_{2i-1} R\, t_{2i-1})$ for all $i \in \mathbb{N}$,
then $H(s_i R\, t_i)$ is neither true nor false. Compare this to the example before
with $H(0, 1, 0, 1, \ldots)$. So $H(s_i R\, t_i)$ is not an ordinary statement, but an
awkward internal something.
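
The 'neither true nor false' phenomenon can be imitated with a toy model. The sketch below rests on an assumption made only for illustration: sequences are restricted to purely periodic ones (the hypothetical class `Periodic`), for which 'for i large enough' is decidable, because a termwise condition then holds eventually exactly when it holds throughout one common period.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Periodic:
    """Hypothetical model of an infinite sequence that repeats `cycle` forever."""
    cycle: Tuple[int, ...]

    def term(self, i: int) -> int:
        return self.cycle[i % len(self.cycle)]


def for_i_large_enough(pred: Callable[..., bool], *seqs: Periodic) -> bool:
    """True iff pred holds termwise from some index on.

    For periodic sequences this is the same as holding at every index of a
    common period (here: the product of the cycle lengths).
    """
    period = 1
    for s in seqs:
        period *= len(s.cycle)
    return all(pred(*(s.term(i) for s in seqs)) for i in range(period))


q = Periodic((0, 1))   # the sequence 0, 1, 0, 1, ...
zero = Periodic((0,))  # the constant sequence 0, 0, ...
one = Periodic((1,))   # the constant sequence 1, 1, ...

q_in_star_S = for_i_large_enough(lambda a: a in {0, 1}, q)               # True
q_is_zero = for_i_large_enough(lambda a, b: a == b, q, zero)             # False
q_is_one = for_i_large_enough(lambda a, b: a == b, q, one)               # False
q_differs_from_zero = for_i_large_enough(lambda a, b: a != b, q, zero)   # False
```

In this model q belongs to the extension of {0, 1} without being equal to 0 or to 1, and neither 'q equals 0' nor 'q differs from 0' holds for i large enough, which is the awkward situation described above.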

Yet the definitions of $^*{<}$ and $^*{>}$, for example, do not cause any problems,
simply because $<$ and $>$ are not yet defined for hyperreal numbers, and the following
case of transfer regarding continuity is legitimate,

\[ \forall \varepsilon \in \mathbb{R},\ \varepsilon > 0 : \exists \delta \in \mathbb{R},\ \delta > 0 : \forall x \in \mathbb{R},\ |x - c| < \delta : |f(x) - f(c)| < \varepsilon \]

is equivalent to,

\[ \forall \varepsilon \in {}^*\mathbb{R},\ \varepsilon > 0 : \exists \delta \in {}^*\mathbb{R},\ \delta > 0 : \forall x \in {}^*\mathbb{R},\ |x - c| < \delta : |{}^*f(x) - f(c)| < \varepsilon. \]

The definitions of infinitesimal and hyperlarge number can even be given a very
simple form:

$\varepsilon \sim 0$ if and only if $\varepsilon = H(\varepsilon_i)$ where
$(\varepsilon_i)$ converges to 0,

and,

$\omega \sim \infty$ if and only if $\omega = H(\omega_i)$ where $(\omega_i)$ converges
to $+\infty$ or to $-\infty$.

Remarks:

1. It may well be that Cauchy's informal treatment comes closer to this kind
of transfer, than to transfer with respect to some free ultrafilter $U$, but
we will never know for sure.

2. Compare the definition of $\omega \sim \infty$ with Corollary 2.12.1.

Even the following remains true: $f$ is continuous at $c \in \mathbb{R}$ if and only if,

\[ \forall \delta \in {}^*\mathbb{R},\ \delta \sim 0 : {}^*f(c + \delta) - f(c) \sim 0. \qquad (4.1) \]

The proof of this equivalence is simple, because the latter statement is equivalent
to,

\[ \forall (\delta_i, i \in \mathbb{N}), \text{ tending to } 0 : (f(c + \delta_i) - f(c), i \in \mathbb{N}) \text{ tends to } 0, \]

which is equivalent to the continuity of $f$ at $c$, and we are back at the plausible
reasoning of Section 1.10. Note that Cauchy applied the simplified definition also
to arbitrary $c \in {}^*\mathbb{R}$, so that he must have used some sort of
S-continuity (see Section 3.3). In fact the $\varepsilon$-$\delta$ definition was
introduced later on by Weierstrass.
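
The sequence form of the continuity criterion lends itself to a quick experiment. The sketch below is only an illustration under assumptions made here: a single null sequence d_i = 1/i is used, and the two test functions (`square` and `step`) are hypothetical examples. A single sequence can refute continuity but cannot establish it, since the criterion quantifies over all null sequences.

```python
def increments(f, c, n_terms=10_000):
    """f(c + d_i) - f(c) along the null sequence d_i = 1/i (an assumed choice)."""
    return [f(c + 1.0 / i) - f(c) for i in range(1, n_terms + 1)]


def square(x):
    """A function that is continuous everywhere."""
    return x * x


def step(x):
    """A function with a jump at 0."""
    return 1.0 if x > 0 else 0.0


# Largest increment among the last 100 terms of each sequence.
square_tail = max(abs(v) for v in increments(square, 1.0)[-100:])
step_tail = max(abs(v) for v in increments(step, 0.0)[-100:])
```

Here `square_tail` is already small, consistent with continuity at c = 1, whereas `step_tail` stays equal to 1, so the increments of the step function along this null sequence do not tend to 0 and continuity at c = 0 fails.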

The conclusion must be that with $F^{\circ}$ instead of $U$ nonstandard mathematics
becomes a very restricted theory and nothing is left of the logician's equivalence
ideal. More seriously to the ordinary mathematician, nothing is left of entirely
new mathematical models that can be studied on the basis of a free ultrafilter and
that cannot exist in classical mathematics, but nevertheless have turned out to be
of great value not only within mathematics, but also outside it (such models have
not been treated in this book). On the other hand what is left in the restricted
theory can be based on well-known facts (such as the equivalence of (4.1) with
ordinary continuity).

The proof of the theorem below is based on the axiom of choice. It is the only
instance where this axiom is required for the theory of nonstandard mathematics,
assuming that the corresponding classical mathematics does not require it. The
axiom of choice can be shown to be equivalent to Zorn's lemma, stating that if
each totally ordered subset of a partially ordered nonempty set $E$ has an upper
bound in $E$ with respect to the implied order, then $E$ has at least one maximal
element.

$E$ is called partially ordered if there exists a binary relation $p$, called order
relation or simply order, for some or all pairs of elements of $E$ such that for
$a, b, c \in E$,

1) if $a\,p\,b$ and $b\,p\,c$ then $a\,p\,c$,

2) if $a\,p\,b$ and $b\,p\,a$ then $a = b$, and

3) $a\,p\,a$ for all $a \in E$.

A subset $G$ of the partially ordered $E$ with order $p$ is totally ordered with respect
to $p$ if $a\,p\,b$ or $b\,p\,a$ or both for all pairs $(a, b)$, $a, b \in G$, and $m$ is
a maximal element of $E$ if, for all $a \in E$, $m\,p\,a$ implies that $a\,p\,m$ and
hence that $a = m$.
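
On a finite example the three axioms and the notions of totally ordered subset and maximal element can be checked directly. The sketch below is an illustration only: the family `E` of small sets and the helper names are hypothetical choices made here, with set inclusion as the order. It also shows that a partially ordered set may have several maximal elements.

```python
from itertools import product


def is_partial_order(elements, rel):
    """Check reflexivity, antisymmetry and transitivity of rel on elements."""
    reflexive = all(rel(a, a) for a in elements)
    antisymmetric = all(
        a == b for a, b in product(elements, repeat=2) if rel(a, b) and rel(b, a)
    )
    transitive = all(
        rel(a, c)
        for a, b, c in product(elements, repeat=3)
        if rel(a, b) and rel(b, c)
    )
    return reflexive and antisymmetric and transitive


def is_totally_ordered(subset, rel):
    """Every pair of the subset is comparable in at least one direction."""
    return all(rel(a, b) or rel(b, a) for a, b in product(subset, repeat=2))


def maximal_elements(elements, rel):
    """m is maximal if rel(m, a) forces a == m."""
    return [m for m in elements if all(a == m for a in elements if rel(m, a))]


def inclusion(a, b):
    """The order used here: a is a subset of b."""
    return a <= b


# Hypothetical finite family of sets, ordered by inclusion.
E = [frozenset(s) for s in ([], [1], [2], [1, 2], [3])]
chain = [frozenset(), frozenset([1]), frozenset([1, 2])]

order_ok = is_partial_order(E, inclusion)
chain_ok = is_totally_ordered(chain, inclusion)
maxima = maximal_elements(E, inclusion)
```

In this family the chain has the upper bound {1, 2}, and the maximal elements are {1, 2} and {3}; neither is a largest element, which is why Zorn's lemma speaks of maximal elements only.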

The proof showing the equivalence of the axiom of choice and Zorn's lemma is 
by no means trivial, and can be found in several textbooks, e.g. Dunford and 
Schwartz [30]. 

Theorem. Free ultrafilters over $\mathbb{N}$ exist.

Proof: By 'filter' will be meant 'filter over $\mathbb{N}$'. Let $F^{\circ}$ be the
Fréchet filter, i.e. the set of the complements of all finite subsets of $\mathbb{N}$,
so that $Q \in F^{\circ}$ if and only if $\{n, n + 1, \ldots\} \subset Q$ for some
$n \in \mathbb{N}$. Let $E$ be the set of all filters $F$ such that
$F \supseteq F^{\circ}$. Then $E$ is nonempty and can be partially ordered by means of
the order $p$, where $a\,p\,b$ if and only if $a \subset b$, i.e. the order is set
inclusion. Let $G$ be any totally ordered subset of $E$. Then,
$B = \bigcup\{F : F \in G\}$ is an element of $E$, and $B$ is an upper bound of $G$.
Obviously $B \supseteq F^{\circ}$, and that $B$ is a filter can be seen as follows.

1) $\mathbb{N} \in B$, as $\mathbb{N} \in F^{\circ} \subset B$.

2) $\emptyset \notin B$, as $B$ is the union of filters.

3) If $Q \in B$ and $\mathbb{N} \supseteq R \supseteq Q$, then $Q \in F$ for some
$F \in G$, hence $R \in F$, so $R \in B$.

4) If $Q, R \in B$, then $Q \in F$ and $R \in F'$ for certain $F, F' \in G$. As $G$ is
totally ordered $F \subset F'$ or $F' \subset F$ (or both). Assume $F \subset F'$, then
$Q, R \in F'$, so $Q \cap R \in F' \subset B$.

That $B$ is an upper bound for $G$ is easily seen.

According to Zorn's lemma $E$ must contain a maximal element $U$. Since
$U \supseteq F^{\circ}$, $U$ is a free filter. $U$ is also an ultrafilter. For let
$Q \subseteq \mathbb{N}$ be arbitrary. In order to show that either $Q \in U$ or
$Q^c \in U$ consider the following two cases.

Case 1: Suppose $\forall Q' \in U : Q \cap Q'$ is infinite. Let,

\[ V = \{T : T \subset \mathbb{N},\ T \supseteq Q \cap Q' \text{ for some } Q' \in U\}. \]

Then $V$ is a filter. The verification of this statement is left as an exercise. Also
$V \in E$, for let $Q' = \{n, n + 1, \ldots\}$, and $U \subset V$. By maximality
$V \subset U$. But $Q \in V$, hence $Q \in U$.

Case 2: Suppose $\exists Q'' \in U : Q \cap Q''$ is finite. Then
$\forall Q' \in U : Q^c \cap Q'$ is infinite. To see this let $Q' \in U$ be arbitrary.
Since both $Q'$ and $Q''$ belong to $U$, also $Q' \cap Q'' \in U$ and is infinite. Since
$Q \cap Q' \cap Q''$ is finite, $Q' \cap Q'' - Q \cap Q' \cap Q''$ is infinite, i.e.
$Q^c \cap Q' \cap Q''$ is infinite and so is $Q^c \cap Q'$. Now apply Case 1 with $Q^c$
instead of $Q$. □